The Detection of Scams on the Ethereum Blockchain#
Academic Insight
Key Insights
Ethereum harbours an array of scams, spanning phishing, Ponzi schemes, and pump-and-dumps.
Identifying and countering these scams pose intricate challenges, encompassing systematic analysis, accurate feature extraction, and prompt detection.
Novel strategies are emerging: trans2vec leverages transaction amounts and timestamps to tackle phishing, SADPonzi analyses contract bytecode to identify Ponzi schemes, and a LightGBM model with N-gram features detects honeypots at an early stage.
Predicting rug pulls involves assessing the pool state and the token distribution among users, while forecasting which coin will be pumped relies on market movement information.
Introduction#
Ethereum is the largest blockchain platform that supports smart contracts. It has become increasingly prosperous and has attracted investors from all over the world. However, due to its anonymity, Ethereum has also become a hotbed for various kinds of fraudulent activity, such as phishing scams, Ponzi schemes, honeypot schemes, rug pull scams, and pump-and-dump schemes, all of which pose a serious threat to trading security on Ethereum. According to the Chainalysis 2022 Crypto Crime Report [Tea22], scams have been the largest form of cryptocurrency-based crime since 2017 and have led to significant losses, as shown in [Fig. 13]. It is therefore imperative to protect investors from scams and to create a secure trading ecosystem on Ethereum. In this science note, we delve into the current academic landscape surrounding scam detection on Ethereum and summarise detection techniques for five common kinds of scams.
Challenges in Scam Detection on Ethereum#
There are three main challenges to be addressed in scam detection on Ethereum:
How to systematically analyse scams: Different types of scams may employ different methods and strategies, targeting different victims. Therefore, researchers need to collect and analyse a large amount of fraud case data to gain a deeper understanding of the characteristics and patterns of fraudulent behaviour.
How to extract effective features: The performance of scam detection is closely related to the choice of extracted features. Since fraudulent behaviour may exhibit subtle differences from normal behaviour, it is necessary to select discriminative features to distinguish between the two.
How to detect scams in a timely manner: Timely detection is crucial to minimise losses and prevent more ordinary investors from falling victim to fraud. When scams are identified or predicted early, authorities and exchanges can take appropriate action to freeze suspicious accounts and block fraudulent transactions.
Phishing Scam Detection#
Wu et al. [WYL+22] conducted the first investigation of phishing identification on Ethereum. Transaction information is critical for this task but cannot be captured by general random walk-based network embedding methods. They therefore proposed a novel network embedding algorithm, trans2vec, which takes transaction amounts and timestamps into account when extracting features for subsequent phishing identification. The algorithm assumes that a larger transaction amount implies a closer relationship between two accounts, and that a more recent transaction has a greater impact on their current relationship.
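These two assumptions can be illustrated with a small biased random walk. The following is a minimal sketch and not the authors' implementation: the transaction tuples, the mixing weight `alpha`, and the choice to combine an amount share with a timestamp-rank share are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_graph(transactions):
    """Group (sender, receiver, amount, timestamp) tuples by sender."""
    graph = defaultdict(list)
    for sender, receiver, amount, timestamp in transactions:
        graph[sender].append((receiver, amount, timestamp))
    return graph


def transition_probs(edges, alpha=0.5):
    """Return one probability per outgoing edge.

    alpha is a hypothetical mixing weight: 1.0 uses amounts only,
    0.0 uses recency only. Recency is measured by timestamp rank,
    so the most recent transaction gets the largest share.
    """
    amounts = [amount for _, amount, _ in edges]
    total_amount = sum(amounts)
    if total_amount > 0:
        amount_share = [a / total_amount for a in amounts]
    else:
        amount_share = [1 / len(edges)] * len(edges)

    ordered = sorted({t for _, _, t in edges})
    ranks = [ordered.index(t) + 1 for _, _, t in edges]
    total_rank = sum(ranks)
    time_share = [r / total_rank for r in ranks]

    # A convex combination of two distributions is itself a distribution.
    return [alpha * a + (1 - alpha) * t
            for a, t in zip(amount_share, time_share)]


def random_walk(graph, start, length, alpha=0.5, rng=None):
    """Walk up to `length` nodes, preferring large and recent transfers."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        edges = graph.get(walk[-1])
        if not edges:  # no outgoing transactions: stop early
            break
        probs = transition_probs(edges, alpha)
        receivers = [receiver for receiver, _, _ in edges]
        walk.append(rng.choices(receivers, weights=probs, k=1)[0])
    return walk
```

As in other random walk-based embedding methods, the resulting walks would typically be passed to a skip-gram model to obtain account embeddings, which then serve as features for a downstream classifier.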
With the growing popularity of Non-Fungible Tokens (NFTs), new forms of NFT phishing scams have emerged in the Ethereum ecosystem. Previous research lacks a systematic review and retrospective analysis of these scams. Yang et al. [YLW23] collected 469 NFT phishing accounts and their transactions, systematically summarised the different patterns of NFT phishing scams, and measured the economic impact and preferences of scammers. Interestingly, NFT phishers transferred 57.5% of the stolen NFTs to their accomplices for further operations, which points to gang theft. Detecting NFT phishing gangs and exploring their withdrawal methods are potential directions for future research.
Ponzi Scheme Detection#
Existing methods for identifying Ponzi smart contracts fall into two categories: transaction behaviour-based detection [JLTGG19] and opcode-based detection [CZC+18]. The former requires a considerable number of transactions to learn the behaviours, and the latter lacks interpretability. Chen et al. [CLS+21] proposed SADPonzi, a semantic-aware detection approach that uses symbolic execution to extract semantic information from contract bytecode and matches it against four semantic patterns of Ponzi contracts in order to identify them. Experimental results indicate that SADPonzi outperforms all existing techniques in terms of accuracy and robustness. However, symbolic execution is limited in its ability to handle evasion methods, which can cause serious path explosion.
Honeypot Scheme Detection#
To detect honeypot contracts soon after their creation, Chen et al. [CGC+20] put forward a machine learning model for honeypot contract detection based on N-gram features and LightGBM. They construct a series of N-gram-based features and use a feature selection method to discard the uninformative ones. The model performs well under different degrees of class imbalance in the data set. A promising future direction is to combine the behaviour of contract creators with the features of the contracts themselves to obtain a more accurate classification model for detecting honeypot contracts.
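The feature construction step can be sketched as follows. This is an illustrative sketch only: it assumes the N-grams are taken over a contract's already disassembled opcode sequence, and the `min_count` cut-off is a simple stand-in for the feature selection method, which the text does not specify. Training a LightGBM classifier on the resulting matrix is omitted.

```python
from collections import Counter


def ngram_counts(tokens, n_values=(1, 2, 3)):
    """Count contiguous N-grams of each requested length in a token list."""
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def to_matrix(samples, min_count=1):
    """Turn per-contract Counters into a vocabulary and dense count rows.

    N-grams whose total count across all contracts is below min_count
    are dropped (a hypothetical, very simple form of feature selection).
    """
    totals = Counter()
    for counts in samples:
        totals.update(counts)
    vocab = sorted(g for g, total in totals.items() if total >= min_count)
    rows = [[counts.get(g, 0) for g in vocab] for counts in samples]
    return vocab, rows


# Hypothetical opcode sequences for two contracts.
contract_a = ["PUSH1", "PUSH1", "MSTORE", "CALLVALUE"]
contract_b = ["PUSH1", "MSTORE", "CALLVALUE", "ISZERO"]
vocab, rows = to_matrix(
    [ngram_counts(contract_a), ngram_counts(contract_b)], min_count=2
)
```

Each row of `rows` is a feature vector aligned with `vocab`, which is the shape of input that gradient-boosted tree classifiers generally expect.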
Rug Pull Scam Detection#
Xia et al. [XWG+21] were the first to propose an accurate approach for flagging rug pull scams and scam tokens on Uniswap, based on a guilt-by-association heuristic and a machine learning-powered technique. The guilt-by-association heuristic starts from obvious scam tokens and scammers and expands the set of both. Machine learning-based detection then identifies further scammers and scam tokens from transactions on Uniswap. Interestingly, they found thousands of collusion addresses that help carry out the scams in league with the scam token and pool creators. Four kinds of collusion addresses can be seen in [Fig. 14].
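The idea of guilt by association can be sketched as a label propagation over address and token links. This is a hedged illustration rather than the procedure of [XWG+21]: the `links` pairs (for example, "this address created this token") and the seed set are hypothetical inputs, and the real heuristic may rely on different or additional relations.

```python
from collections import defaultdict


def expand_by_association(links, seed_tokens):
    """Expand a seed set of scam tokens through shared addresses.

    links: iterable of (address, token) pairs with a strong tie,
           e.g. the address created the token (assumed input).
    Returns (scam_tokens, scammers) once no new items can be added.
    """
    tokens_of = defaultdict(set)
    addresses_of = defaultdict(set)
    for address, token in links:
        tokens_of[address].add(token)
        addresses_of[token].add(address)

    scam_tokens = set(seed_tokens)
    scammers = set()
    frontier = list(scam_tokens)
    while frontier:
        token = frontier.pop()
        for address in addresses_of.get(token, ()):
            if address in scammers:
                continue
            scammers.add(address)  # tied to a scam token
            for other in tokens_of[address]:
                if other not in scam_tokens:
                    scam_tokens.add(other)  # tied to a scammer
                    frontier.append(other)
    return scam_tokens, scammers


links = [("0xA", "T1"), ("0xA", "T2"), ("0xB", "T3")]
tokens, addresses = expand_by_association(links, {"T1"})
```

In this toy input, the seed `T1` implicates its creator `0xA`, which in turn implicates `T2`, while `0xB` and `T3` remain unflagged because nothing links them to the seed.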
However, the method proposed by Xia et al. [XWG+21] can only detect scams accurately after they have been executed. Mazorra et al. [MAD22] designed an automated rug pull detection method that predicts future rug pulls and scams from features describing the pool's state and the token distribution among users. They use the Herfindahl–Hirschman Index and the clustering transaction coefficient as heuristics to measure how the token is distributed among investors. They then use these features to train XGBoost and FT-Transformer models, which flag scam tokens before the malicious manoeuvre takes place.
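The Herfindahl–Hirschman Index is the sum of squared shares, so for N holders it ranges from 1/N (evenly spread) to 1 (a single holder owns everything). The sketch below computes it for a list of holder balances; the balances are made-up values, and how holders and balances are collected from the chain is outside its scope.

```python
def herfindahl_hirschman_index(balances):
    """Return the sum of squared ownership shares for the given balances."""
    total = sum(balances)
    if total <= 0:
        raise ValueError("balances must sum to a positive value")
    return sum((balance / total) ** 2 for balance in balances)


# Hypothetical token distributions among four holders.
even = herfindahl_hirschman_index([25, 25, 25, 25])    # 0.25
concentrated = herfindahl_hirschman_index([97, 1, 1, 1])  # about 0.94
```

A value close to 1 indicates that most of the supply sits with very few addresses, which is the kind of concentration the heuristic is meant to capture.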
Pump-and-Dump Scheme Detection#
Telegram, with its relative anonymity, has made it easy to organise pump-and-dump activities in channels with many participants. Xu et al. [XL19] analysed the features of pumped coins and the market movements of coins before, during, and after a pump and dump. Using market movement information, they also built a random forest model and a generalised linear model that can predict which coin will be pumped before the actual pump event announced in Telegram channels. In addition, they proposed a simple but effective trading strategy that can be used in combination with the prediction models, so that fewer people fall victim to market manipulation and more people trade strategically. In contrast to [XL19], La Morgia et al. [LMMSS23] built a machine learning model that uses rush order information to detect a pump-and-dump scheme within 25 seconds of its start, rather than predicting it before it happens.
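To give a feel for detection based on rush orders, the sketch below counts rush orders in fixed time windows and flags sudden spikes. It is deliberately simpler than the approach of [LMMSS23]: a threshold rule is swapped in for their machine learning model, each trade is assumed to arrive already labelled as a rush order or not, and the parameters `window`, `history`, `factor`, and `min_count` are hypothetical.

```python
from collections import Counter


def rush_counts(trades, window=25):
    """Count rush orders per window.

    trades: iterable of (timestamp_seconds, is_rush_order) pairs.
    Returns a Counter keyed by window index (timestamp // window).
    """
    counts = Counter()
    for timestamp, is_rush in trades:
        if is_rush:
            counts[int(timestamp // window)] += 1
    return counts


def flag_spikes(counts, history=4, factor=5.0, min_count=10):
    """Flag windows whose rush order count far exceeds the recent average."""
    flagged = []
    for index in sorted(counts):
        past = [counts.get(index - k, 0) for k in range(1, history + 1)]
        baseline = sum(past) / history
        if counts[index] >= min_count and counts[index] > factor * baseline:
            flagged.append(index)
    return flagged


# Hypothetical trade stream: one rush order in each of four quiet windows,
# followed by a burst of twenty in the fifth window.
trades = [(w * 25 + 1, True) for w in range(4)]
trades += [(100 + i, True) for i in range(20)]
spikes = flag_spikes(rush_counts(trades))
```

In practice, per-window counts like these would be one of several features fed to a classifier rather than compared against a fixed threshold.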
Conclusion#
The popularity of Ethereum has attracted a surge of fraudulent activities, posing serious risks to users. Although detecting and preventing scams on Ethereum presents several challenges, ongoing research and innovative approaches are making significant progress. Scam detection on Ethereum remains a worthwhile and pressing problem in the field. Through continued exploration and innovation, we can collectively strive to build a more secure and trustworthy cryptocurrency trading ecosystem.
References#
- CGC+20
Weili Chen, Xiongfeng Guo, Zhiguang Chen, Zibin Zheng, Yutong Lu, and Yin Li. Honeypot contract risk warning on Ethereum smart contracts. In IEEE International Conference on Joint Cloud Computing, 1–8. Oxford, UK, Aug. 2020. IEEE.
- CZC+18
Weili Chen, Zibin Zheng, Jiahui Cui, Edith Ngai, Peilin Zheng, and Yuren Zhou. Detecting Ponzi schemes on Ethereum: towards healthier blockchain technology. In Proceedings of the ACM Web Conference, 1409–1418. Lyon, France, April 2018. ACM.
- CLS+21
Weimin Chen, Xinran Li, Yuting Sui, Ningyu He, Haoyu Wang, Lei Wu, and Xiapu Luo. SADPonzi: detecting and characterizing Ponzi schemes in Ethereum smart contracts. In Proceedings of the ACM on Measurement and Analysis of Computing Systems, volume 5. New York, NY, USA, June 2021. ACM.
- JLTGG19
Eunjin Jung, Marion Le Tilly, Ashish Gehani, and Yunjie Ge. Data mining-based Ethereum fraud detection. In IEEE International Conference on Blockchain, 266–273. Atlanta, USA, July 2019. IEEE.
- LMMSS23
Massimo La Morgia, Alessandro Mei, Francesco Sassi, and Julinda Stefa. The doge of wall street: analysis and detection of pump and dump cryptocurrency manipulations. ACM Transactions on Internet Technology, Feb. 2023.
- MAD22
Bruno Mazorra, Victor Adan, and Vanesa Daza. Do not rug on me: leveraging machine learning techniques for automated scam detection. Mathematics, 10(6):949, Mar. 2022.
- Tea22
Chainalysis Team. The 2022 crypto crime report. Feb. 2022. URL: go.chainalysis.com/2021-crypto-crime-report.
- WYL+22
Jiajing Wu, Qi Yuan, Dan Lin, Wei You, Weili Chen, Chuan Chen, and Zibin Zheng. Who are the phishers? Phishing scam detection on Ethereum via network embedding. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 52(2):1156–1166, Feb. 2022.
- XWG+21
Pengcheng Xia, Haoyu Wang, Bingyu Gao, Weihang Su, Zhou Yu, Xiapu Luo, Chao Zhang, Xusheng Xiao, and Guoai Xu. Trade or trick? Detecting and characterizing scam tokens on Uniswap decentralized exchange. In Proceedings of the ACM on Measurement and Analysis of Computing Systems, volume 5, 1–26. New York, NY, USA, December 2021. ACM.
- XL19
Jiahua Xu and Benjamin Livshits. The anatomy of a cryptocurrency Pump-and-Dump scheme. In Proceedings of the 28th USENIX Conference on Security Symposium, 1609–1625. Santa Clara, CA, Aug. 2019. USENIX Association.
- YLW23
Jingjing Yang, Jieli Liu, and Jiajing Wu. With trail to follow: measurements of real-world non-fungible token phishing attacks on Ethereum. arXiv preprint arXiv:2307.01579, 2023.