Synthetic data in finance & more
Latest initiatives from the FCA, guidance from the UNECE, and new tools
Happy Wednesday, and thanks for joining this list! In this issue, I've gathered the latest FCA initiatives around synthetic financial data, recent guidance published by the UNECE, new tools, and recent research on synthetic data.
Enjoy!
📰 Recent initiatives on synthetic data
Findings from the Financial Conduct Authority (FCA) research: To understand how to drive the adoption of synthetic data in financial markets, the FCA put out a Call for Input and has now released the results. They include feedback on common challenges in accessing financial data, techniques for generating financial synthetic data, use cases, and the expected role of the regulator. (link)
Answers from the FCA Feedback Statement on financial synthetic data use cases.
The FCA is starting a “Synthetic Data Expert Group”: This group is expected to identify relevant use cases for synthetic data in financial markets, clarify open issues around synthetic data, and develop best practices. Applications are open until March 8th to regulators, consultants, legal professionals, accelerators, consumer groups, and academics. (link)
How input data affects the generation of financial synthetic tabular data with VAEs: Researchers from the Royal Bank of Canada and the University of Toronto published a paper on the interpretability of synthetic financial data generated with VAEs. They propose a sensitivity-based method to assess how the input features of a tabular dataset affect the way VAEs synthesize data. (link)
Synthetic data for official statistics: The United Nations Economic Commission for Europe (UNECE) has published a guide that explores the use of synthetic data for sharing sensitive microdata for statistical analysis. It also provides guidance on implementing synthetic data. (link)
⚙ New synthetic data companies and tools
Tasq AI offers a QA engine for unstructured synthetic data. (link)
Titan Synthetics (formerly Kairos Technologies) offers a synthetic data solution for the healthcare industry. (link)
The van der Schaar Lab launched its open-source synthetic data initiative Synthcity. (link)
Artificial Pixels provides services around unstructured synthetic datasets. (link)
REaLTabFormer is a transformer-based model for generating tabular and relational synthetic data. (link + code)
📣 From the community
Synthesize 2023 replay: Gretel hosted the first edition of their developer conference on synthetic data. They had speakers from synthetic data startups, companies working on ML projects, and research institutions. All the replays are available on YouTube. (link)
Machine Learning for synthetic data generation: This paper reviews existing work on synthetic data generation using machine learning models, covering applications, machine learning methods, and privacy and fairness issues. (link)
Performance of fraud detection models on rebalanced datasets: This blog post from Statice compares the performance of models trained on original, SMOTE-rebalanced, and synthetic transaction datasets. (link)
UCLA synthetic data workshop: The UCLA Department of Statistics is hosting a two-day workshop on April 13th and 14th. (link)
“Introduction to synthetic data for researchers”: The workshop replay from the UK Data Service is available, along with the demo code on their GitHub. (link)
Synthetic transportation data of 1 million individuals: Researchers developed a trip generation method that balances data availability and individual trip privacy protection. (link)
I post this content every two weeks. You can subscribe below to receive it by email, or forward it to a friend. Have a great week. ✌