![Bitcoin image](https://www.etsmtl.ca/uploads/coinjoin-0.jpg)
Summary
Bitcoin is the best-known and most dominant cryptocurrency on the market, thanks in part to its pseudonymous design, which is intended to protect the identity of its users. Researchers have studied ways to de-anonymize network users, and techniques such as CoinJoin were developed to counter these efforts. CoinJoin allows multiple users to combine their transactions into a single one [1], making financial flows harder to trace and reinforcing anonymity. However, this added confidentiality can also be exploited for illicit activities. Faced with this challenge, our research aims to develop a system that identifies CoinJoin transactions using machine learning techniques.
CoinJoin Transactions
Bitcoin is the most widely used cryptocurrency on the market, making up approximately 54% of the total value of cryptocurrencies [2]. A key aspect of its popularity is the anonymity it offers its users. Users are identified by cryptographic addresses, and transactions are validated via a decentralized process known as “proof-of-work.”
To counter attempts to identify users on the Bitcoin network, techniques such as CoinJoin were developed. They combine multiple transactions into one to make them harder to trace. These services were subsequently exploited for fraudulent and illegal activities. The aim of our research is to detect CoinJoin transactions in order to identify suspicious financial flows without compromising the confidentiality of legitimate users. Existing studies have limitations, notably in handling imbalanced and unlabeled data and in their lack of historical analysis. To overcome these problems, we traced transactions back to their origin and tuned machine learning algorithms with the Optuna tool, which improved detection of suspicious transactions.
CoinJoin Transaction Detection
We used machine learning techniques to develop a system that can detect CoinJoin transactions more accurately and efficiently than current methods. We first collected Bitcoin transaction data directly from a full node, which allowed us to follow transaction history and extract the necessary data with our own script. Next, we cleaned the data, removing what was not useful, and formatted it for analysis. We then identified the features that best characterize these transactions and retained only the most useful ones, which improved the performance of our system.
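As a rough illustration of the feature-extraction step, the sketch below turns one decoded transaction into a flat set of numeric features. The transaction layout (a dict with `inputs` and `outputs` lists carrying a `value` in satoshis) and the feature names are assumptions made for this example; they are not the feature set used in the study.

```python
from collections import Counter


def extract_features(tx):
    """Turn one decoded transaction into a flat feature dictionary.

    `tx` is a hypothetical dict with "inputs" and "outputs" lists, where
    each entry has a "value" in satoshis. This layout is an assumption.
    """
    input_values = [entry["value"] for entry in tx["inputs"]]
    output_values = [entry["value"] for entry in tx["outputs"]]
    value_counts = Counter(output_values)
    # Size of the largest group of outputs that share the same amount.
    max_equal_outputs = max(value_counts.values()) if value_counts else 0
    return {
        "n_inputs": len(input_values),
        "n_outputs": len(output_values),
        "n_distinct_output_values": len(value_counts),
        "max_equal_outputs": max_equal_outputs,
        "fee": sum(input_values) - sum(output_values),
    }


# Made-up transaction: five inputs and five equal-valued outputs.
example_tx = {
    "inputs": [{"value": 1_000_500} for _ in range(5)],
    "outputs": [{"value": 1_000_000} for _ in range(5)],
}
features = extract_features(example_tx)
print(features)
```

Counting outputs that share the same amount is included because equal-valued outputs are a common trait of CoinJoin-style mixing; a real pipeline would likely add features derived from the transaction history as well.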
We then divided the data into two groups: 80% to train our system and 20% to test its performance. Testing on data held out from training lets us check that the system has learned well and can be applied to new data. We used a tool called Optuna to select the parameters that deliver the best results. We evaluated the models using several indicators, including Precision, Recall and F1 Score, to measure their ability to correctly identify transactions. Finally, we used Neo4j to create graphs that visually show suspicious transactions, helping us spot unusual behavior.
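The 80/20 split can be sketched in a few lines of plain Python. The version below is stratified, meaning each class keeps the same proportion in both groups. That choice is an assumption made here because CoinJoin transactions are rare compared with ordinary ones; the article does not say how the split was drawn. The function name and the toy data are hypothetical.

```python
import random


def stratified_split(rows, labels, test_ratio=0.2, seed=42):
    """Split rows into train/test sets while preserving class proportions."""
    rng = random.Random(seed)
    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one example of each class in the test set.
        n_test = max(1, round(len(indices) * test_ratio))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    train = [(rows[i], labels[i]) for i in train_idx]
    test = [(rows[i], labels[i]) for i in test_idx]
    return train, test


# Toy data: 10 CoinJoin examples (label 1) among 100 transactions.
labels = [1] * 10 + [0] * 90
rows = list(range(100))
train, test = stratified_split(rows, labels)
print(len(train), len(test))
```

With a purely random split, a small test set could end up with very few positive examples, which would make Recall hard to interpret.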
![flowchart describing the various stages of our mixed Bitcoin transaction detection system.](https://www.etsmtl.ca/uploads/coinjoin-1fr.jpg)
Parameter Optimization With Optuna
In our study, one of our objectives was to see how adjusting the parameters of our models could improve their performance. To do this, we used Optuna, a tool that searches for the parameter values that make a model perform best. Other tools exist, such as Auto-Sklearn or TPOT, but Optuna stands out for its speed, ease of use and solid performance, even with large amounts of data.
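The general idea behind parameter optimization is to define an objective that scores one set of parameters, then run many trials and keep the best one. The sketch below shows that loop with plain random search in standard-library Python. It is a stand-in for illustration, not Optuna itself, whose samplers choose each trial's parameters more cleverly than the uniform draws used here. The objective function, parameter names and ranges are all hypothetical.

```python
import random


def objective(params):
    """Hypothetical validation score, highest at max_depth=6, learning_rate=0.1.

    In practice this would train a model with `params` and return a metric
    such as the F1 Score on validation data.
    """
    depth_penalty = (params["max_depth"] - 6) ** 2 / 100
    lr_penalty = (params["learning_rate"] - 0.1) ** 2 * 10
    return 1.0 - depth_penalty - lr_penalty


def random_search(objective, n_trials=200, seed=0):
    """Try n_trials random parameter sets and return the best one found."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "max_depth": rng.randint(2, 12),
            "learning_rate": rng.uniform(0.01, 0.3),
        }
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(objective)
print(best_params, best_score)
```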
We first tested reference models without optimization. These served as a baseline against which to measure the improvements brought by Optuna. Before tuning, results varied from one model to another. After using Optuna to fine-tune the parameters, we saw a noticeable improvement in all performance indicators. One of the most striking results was that our system detected all CoinJoin transactions with no “false negatives.” A false negative occurs when the model fails to identify a CoinJoin transaction and wrongly classifies it as a normal transaction, which can represent a significant risk to financial organizations. Our results therefore underline the importance of optimizing model parameters to better detect suspicious transactions and avoid critical errors.
![optimization with Optuna](https://www.etsmtl.ca/uploads/coinjoin-2en.jpg)
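To make the link between false negatives and the evaluation indicators concrete, the sketch below computes Precision, Recall and F1 Score from predicted and true labels. The labels are made up for illustration and are not results from the study. With zero false negatives, Recall is exactly 1.0, while a false positive still lowers Precision.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (label 1 = CoinJoin)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1  # a CoinJoin transaction missed by the model
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute the three indicators, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels: every CoinJoin is caught, one normal transaction is flagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn, precision, recall, f1)
```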
In-Depth Analysis of CoinJoin Transactions Using Neo4j Graphs
We analyzed CoinJoin transactions in detail by representing them as graphs in Neo4j. Each transaction is visualized as a “node,” and the links between inputs and outputs are represented by “connections” between these nodes. This allows us to see how transactions relate to each other and to spot patterns. For a given transaction, we can find all associated transactions up to a certain depth. Each node contains information such as the number of inputs and outputs, the amount of bitcoin, and the type of address. For example, we can observe a pattern with 5 inputs and 5 outputs, which is typical of Samourai Whirlpool CoinJoin transactions.
![specific 5-input, 5-output pattern for Samourai Whirlpool CoinJoin transactions.](https://www.etsmtl.ca/uploads/coinjoin-3en.jpg)
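Finding all transactions linked to a given one up to a certain depth is a breadth-first traversal of the transaction graph. The sketch below shows the idea in plain Python on an in-memory adjacency dict, rather than in Neo4j. The transaction IDs, the graph and the node attributes are made up for illustration.

```python
from collections import deque


def related_transactions(graph, start, max_depth):
    """Return {txid: depth} for every transaction within max_depth hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        txid = queue.popleft()
        depth = seen[txid]
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbor in graph.get(txid, []):
            if neighbor not in seen:
                seen[neighbor] = depth + 1
                queue.append(neighbor)
    return seen


def has_5x5_pattern(node):
    """Check for the 5-input, 5-output shape described in the text."""
    return node["n_inputs"] == 5 and node["n_outputs"] == 5


# Hypothetical graph: each transaction points to those spending its outputs.
graph = {
    "tx_a": ["tx_b", "tx_c"],
    "tx_b": ["tx_d"],
    "tx_c": [],
    "tx_d": ["tx_e"],
    "tx_e": [],
}
nodes = {
    "tx_a": {"n_inputs": 1, "n_outputs": 2},
    "tx_b": {"n_inputs": 5, "n_outputs": 5},
    "tx_c": {"n_inputs": 1, "n_outputs": 1},
    "tx_d": {"n_inputs": 5, "n_outputs": 5},
    "tx_e": {"n_inputs": 2, "n_outputs": 2},
}

nearby = related_transactions(graph, "tx_a", max_depth=2)
candidates = [txid for txid in nearby if has_5x5_pattern(nodes[txid])]
print(nearby, candidates)
```

A shape check like this only narrows down candidates. In the system described here, the classification itself is done by the trained model.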
Conclusion
Our system could provide financial institutions, cryptocurrency platforms and regulators with powerful tools for more effective monitoring of financial flows on blockchains, and detection of fraudulent behavior. This would help boost transparency in the cryptocurrency ecosystem while preserving the privacy of legitimate users.
Additional Information
For more information on this research, please refer to the following paper:
O. Dekhil, R. T. Naha, M. M. Feridani, F. Najjar and K. Zhang, "Detecting Bitcoin CoinJoin Transactions Using Machine Learning," 2024 IEEE Blockchain Computing and Applications (BCCA)
References
- F. K. Maurer, T. Neudecker and M. Florian, "Anonymous CoinJoin Transactions with Arbitrary Values," 2017 IEEE Trustcom/BigDataSE/ICESS, Sydney, NSW, Australia, 2017, pp. 522-529, doi: 10.1109/Trustcom/BigDataSE/ICESS.2017.280.
- https://coincodex.com/bitcoin-...