From hedge fund managers to mutual funds and even private equity managers, alternative data has the power to improve valuation of securities and boost the clarity of the investment process.
This article explores the challenges of alternative data adoption, how to overcome them, and the potential of automation.
Alternative data enriches the structured data sets already acquired by investment management firms, fueling the potential for information advantage and providing a distinct differentiator in terms of speed and knowledge. In today’s competitive market, even a short window of opportunity or information advantage can lead to a dramatic boost in returns.
Sentiment data, geolocation data, credit/debit card transaction data, satellite data, email receipt data, web scraping data, web traffic data, and more are all valuable sources of information that contain latent insights useful for systematic and discretionary investing. Alternative data directly supports investment firms' goal of outperforming competitors or benchmarks, and its use is becoming more prevalent. According to Deloitte, the industry is still in the early adopter phase, but spending on alternative data by trading and asset management firms may exceed $7 billion by 2020.
The Hurdles of Alternative Data Adoption
With much of the industry moving to adopt alternative data, investment firms that do not incorporate it into their strategy risk being left behind. However, these firms also need to consider the talent, capabilities, and infrastructure required to keep up with the proliferation of data sources and with innovation in extracting alpha from these alternative datasets. To incorporate alternative data effectively into the investment or portfolio construction process, organizations will need both commitment and access to specialized approaches to analyzing and visualizing data.
Alternative data is highly unstructured and comes in multiple formats, making it challenging and time-consuming to analyze when building investment models. Quants must spend a large amount of time programming models, debugging code, and integrating multiple data sources. While advanced analytics, including machine learning and natural language processing, can help crack this puzzle, only a handful of firms have started adding the specialized talent required to implement these strategies. The average quant can take anywhere from 10 weeks to 7 months to develop, code, test, and launch such strategies. Democratization and automation are a must.
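To make the integration burden concrete, here is a minimal Python sketch of reconciling two feeds that arrive in different formats. Everything in it is an assumption made for illustration: the feed layouts, the field names, the `Observation` record, and the sample values are hypothetical and do not come from any real vendor.

```python
import csv
import io
import json
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One normalized data point: ticker, day, signal name, value."""
    ticker: str
    day: date
    signal: str
    value: float


def from_card_csv(text):
    """Parse a hypothetical card-transaction feed delivered as CSV."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        Observation(r["ticker"].upper(), date.fromisoformat(r["date"]),
                    "card_spend_usd", float(r["spend_usd"]))
        for r in rows
    ]


def from_traffic_json(text):
    """Parse a hypothetical web-traffic feed delivered as JSON."""
    return [
        Observation(item["symbol"].upper(), date.fromisoformat(item["day"]),
                    "web_visits", float(item["visits"]))
        for item in json.loads(text)
    ]


def merge_by_key(*sources):
    """Group every signal under a shared (ticker, day) key."""
    merged = {}
    for source in sources:
        for obs in source:
            merged.setdefault((obs.ticker, obs.day), {})[obs.signal] = obs.value
    return merged


# Made-up sample payloads; note the differing field names and ticker casing.
CARD_CSV = "ticker,date,spend_usd\nacme,2019-03-01,125000.50\n"
TRAFFIC_JSON = '[{"symbol": "ACME", "day": "2019-03-01", "visits": 48210}]'

panel = merge_by_key(from_card_csv(CARD_CSV), from_traffic_json(TRAFFIC_JSON))
print(panel)
```

Even in this toy case, each new source needs its own parser and its own mapping onto the shared key, which is the kind of repetitive work the rest of this article argues should be automated.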
The Time is Right for Automation
Techniques like natural language processing and machine learning allow organizations to better capitalize on alternative data. These technologies enable processing of large, heterogeneous, and unstructured data sets at an extremely fast rate. According to a survey from Strategic Consulting, 62% of systematic managers are investing in these technologies. However, how can organizations ensure that these artificial intelligence approaches keep up with the prevalence and rate of growth of alternative data?
SparkCognition has developed a portfolio of solutions that automate the data science process and put the power of NLP and machine learning in the hands of quants. With DeepNLP™, a solution that adds structure to unstructured data, analysts can automatically extract information from speeches, news stories, television, press releases, presentations, websites, web traffic, IoT sensors, proprietary databases, and government databases. DeepNLP allows analysts to extract custom ontologies and automate the generation of summary reports by indicating keywords and names for desired topics.
[Figure: Overview of Alpha Process and Key Developments. Panel: Unstructured Data and Structured Data]
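The general idea of adding structure to text by naming topics and keywords can be illustrated with a small Python sketch. This is not DeepNLP's API or method, which the article does not describe. The `TOPICS` ontology, the function names, and the sample press release are all hypothetical, and the matching is deliberately naive (exact keyword hits within a sentence).

```python
import re
from collections import defaultdict

# Hypothetical topic ontology: topic name -> keywords that signal it.
TOPICS = {
    "earnings": ["earnings", "revenue", "guidance"],
    "supply_chain": ["supplier", "shipment", "inventory"],
}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_document(text, topics=TOPICS):
    """Return {topic: [sentences that mention any of the topic's keywords]}."""
    hits = defaultdict(list)
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        for topic, keywords in topics.items():
            if words & set(keywords):
                hits[topic].append(sentence)
    return dict(hits)


# Made-up input text for illustration only.
PRESS_RELEASE = (
    "Acme raised its full-year revenue guidance. "
    "A key supplier reported shipment delays in March. "
    "The company also opened a new office."
)

report = tag_document(PRESS_RELEASE)
for topic, sentences in report.items():
    print(topic, "->", sentences)
```

The output is a structured record per topic that downstream models can consume. A production system would need far more than keyword matching, such as entity resolution, synonym handling, and context, which is where dedicated NLP tooling comes in.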
Once the data has been structured, Darwin™, SparkCognition's automated model building solution, can create machine learning models on demand. Darwin brings data science expertise to investment firms, enabling analysts to prototype, develop, and deploy models at scale without being data scientists themselves. These models continuously learn from data sources to recommend various trading and positioning actions based on investment goals, strategy, risk tolerance, and overall macro state.
[Figure: Structured Data and Predictions. Source: SparkCognition]
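As a rough illustration of what automating model building can mean, the Python sketch below scores several candidate forecasters with walk-forward validation and keeps the one with the lowest error. It is a toy under stated assumptions and not a description of how Darwin works: the moving-average candidates, the function names, and the price series are all invented for this example.

```python
from statistics import mean


def make_moving_average(window):
    """Build a forecaster that predicts the mean of the last `window` values."""
    def forecast(history):
        return mean(history[-window:])
    forecast.__name__ = f"moving_average_{window}"
    return forecast


def walk_forward_mae(forecast, series, warmup):
    """Mean absolute error of one-step-ahead forecasts.

    At each step t the forecaster sees only series[:t], so no future
    data leaks into the evaluation.
    """
    errors = [
        abs(forecast(series[:t]) - series[t])
        for t in range(warmup, len(series))
    ]
    return mean(errors)


def select_model(candidates, series, warmup):
    """Score every candidate and return (best_name, all_scores)."""
    scores = {f.__name__: walk_forward_mae(f, series, warmup) for f in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up, steadily trending series for illustration only.
SERIES = [100, 102, 104, 106, 108, 110, 112, 114]
CANDIDATES = [make_moving_average(w) for w in (1, 3, 5)]

best_name, scores = select_model(CANDIDATES, SERIES, warmup=5)
print(best_name, scores)
```

On this trending series the shortest window wins because longer averages lag the trend. The point of the sketch is the loop itself: candidates are generated, evaluated on held-out steps, and ranked without an analyst hand-tuning each one.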