Creativity is critical to producing superior results in any field, whether it’s the arts, sciences, or technology. The same can be said for alternative data analysis.
The most successful data scientists apply creativity by thinking outside the box when analyzing information to solve challenging problems or understand human behavior. From investment decisions to public policy and market research, creative use of alternative data is becoming key to unlocking valuable insights and gaining a competitive edge.
Alternative Data Pioneers
Digital data comes from many sources; however, none has transformed the economic and business landscape as profoundly as alternative data. Applied creatively, it has revolutionized financial analysis and investment strategies.
Several years ago, some investment firms started using new sources of digitized information instead of relying on traditional ones like government reports and company financial statements. The alternatives included satellite imagery, credit card transaction data, mobile app data, and public web intelligence. When combined with traditional sources, alternative data led to unique and timely insights that gave forward-thinking investment professionals greater returns.
Beyond Investment: Alternative Data in Other Industries
Today, the alternative data industry is worth almost $7 billion. The main drivers behind this growth are business digitalization and advancements in web scraping technologies. Because extracting web intelligence is easier than it used to be, alternative data is no longer the exclusive domain of hedge funds and investment banks. Other organizations using it include academic institutions, NGOs, marketing agencies, insurance companies, and even central banks. It is also an indispensable source of information for investigative journalism.
There are many ways in which businesses utilize alternative data insights. For example, market research firms extract sentiment data to better understand buyer preferences, track emerging trends, and identify changes in the marketplace. Logistics and supply chain companies use data from Internet of Things (IoT) devices as well as satellite imagery to track inventory, manage risk, and optimize shipping routes. Real estate agencies use satellite imagery and traffic data to project real estate values and identify new trends in the residential and commercial markets.
The list of examples is almost endless – with the right web intelligence gathering tools and some creativity, data analysts today can draw on a vast pool of online data for real-time insights. Here are some creative examples to follow:
1. Analyzing investor sentiment
Direct investment data can be used to evaluate how investor sentiment affects markets. For example, data from stock tracking websites can be analyzed in conjunction with anonymous data scraped from investment forums and discussion groups to identify connections between aggregate investor sentiment and the shifting value of stocks, cryptocurrency, and other financial instruments. Interactive Brokers, one of the largest electronic trading platforms in the U.S., has started to produce reports that include the stock tickers mentioned most often each day on Reddit’s r/wallstreetbets subreddit.
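As a rough illustration of the ticker-mention idea, the sketch below counts how many posts mention each ticker. It is not how any particular platform builds its reports. The sample posts, the function names, and the assumption that tickers are written as a dollar sign followed by one to five capital letters are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample posts; real input would come from publicly available forum data.
SAMPLE_POSTS = [
    "Buying more $GME today, also watching $AMC",
    "$GME to the moon",
    "Thinking about $TSLA calls. $GME still my top pick",
]

# Assumption: tickers appear as a dollar sign plus 1-5 uppercase letters.
TICKER_PATTERN = re.compile(r"\$([A-Z]{1,5})\b")


def count_ticker_mentions(posts):
    """Count how many posts mention each ticker (at most once per post)."""
    counts = Counter()
    for post in posts:
        # A set ensures a ticker repeated within one post is counted once.
        counts.update(set(TICKER_PATTERN.findall(post)))
    return counts


def top_tickers(posts, n=3):
    """Return the n most-mentioned tickers as (ticker, count) pairs."""
    return count_ticker_mentions(posts).most_common(n)
```

Counts like these only become a sentiment signal once they are set against price data for the same days, and a mention count alone says nothing about whether the posts are positive or negative.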
2. Tracking online consumption
Online consumption in Japan increased significantly during the COVID-19 lockdowns. However, policymakers were unsure which types of households increased online spending and whether this increase was permanent.
To answer these questions, researchers from the Bank of Japan conducted an empirical analysis using transaction data compiled by the financial management service Money Forward, Inc. and granular data from the Survey of Household Economy by the Ministry of Internal Affairs and Communications.
Analyzing both data types, the study authors concluded that online consumption increased across a wide range of age and income groups. Furthermore, households that engaged in online spending during the lockdowns continued this behavior even after regular business activities resumed.
3. Tracking production activity
Another type of alternative data – mobility data – is typically gathered from GPS-enabled devices such as smartphones and tablets. A study from the Bank of Japan examined the creative use of mobility data to enable “nowcasting” – estimating the state of economic indicators in the recent past, the present, and the near future.
Researchers used mobility data to measure sales in service industries and production activity in the manufacturing industry. As a result, they managed to “nowcast” economic activity with a high level of precision – something previously impossible with conventional statistics.
4. Measuring the effects of macroeconomic policy
Data from online marketplaces and retail shops can be used to assess the effect of economic policies on the prices of consumer goods. By tracking prices of widely available items (such as books, clothes, food, or electronics), researchers can gain insights into how macroeconomic factors influence market dynamics.
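The price-tracking idea can be sketched as a very simple index: take the same items on two dates and average their price ratios. The item names, prices, and function name below are hypothetical, and the unweighted average is an assumption made for brevity. Official price indices weight items by their share of spending.

```python
from statistics import mean

# Hypothetical prices for the same items collected on two dates.
PRICES_BEFORE = {"book": 20.0, "t-shirt": 10.0, "headphones": 50.0}
PRICES_AFTER = {"book": 22.0, "t-shirt": 10.0, "headphones": 55.0}


def price_index(base_prices, current_prices):
    """Unweighted average of current/base price ratios, scaled so base = 100."""
    # Only compare items that were observed in both periods.
    common_items = sorted(set(base_prices) & set(current_prices))
    if not common_items:
        raise ValueError("no items observed in both periods")
    ratios = [current_prices[item] / base_prices[item] for item in common_items]
    return 100.0 * mean(ratios)
```

A value above 100 means the tracked basket became more expensive between the two dates. Comparing the index before and after a policy change gives a first indication of its effect, not proof of it.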
5. Measuring the impact of a reputation crisis
Scandals, “cancel culture,” and other public relations crises can significantly impact a brand’s reputation, resulting in lost clients and sales. To measure the effect of an adverse reputational incident, analysts can scrape public brand mentions from social media, forums, or other sites and compare them with sales data to determine if the incident actually impacted revenue.
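One simple way to make this comparison is to find the day when negative mentions peak and compare average sales before and after it. The sketch below assumes two daily series covering the same days. The numbers and the function name are hypothetical.

```python
from statistics import mean

# Hypothetical daily series covering the same eight days.
NEGATIVE_MENTIONS = [12, 15, 11, 14, 240, 180, 90, 40]
DAILY_SALES = [1000, 1040, 980, 1020, 900, 820, 850, 870]


def sales_change_around_peak(mentions, sales):
    """Compare average sales before the mention peak with sales from the peak onward."""
    if len(mentions) != len(sales):
        raise ValueError("series must cover the same days")
    peak_day = mentions.index(max(mentions))
    if peak_day == 0:
        raise ValueError("need at least one day before the peak")
    before = mean(sales[:peak_day])
    after = mean(sales[peak_day:])
    # Relative change in average sales; a negative value means sales fell.
    return peak_day, (after - before) / before
```

A drop after the peak is only a starting point. Seasonality, promotions, or other events in the same period could explain it, so a real analysis would need to control for them.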
Collecting Alternative Data
There’s no question that alternative data is vital to extracting the relevant, accurate insights needed to formulate precise strategies and enhance decision-making. However, collecting it can require specialized knowledge, as alternative data is usually unstructured and comes in multiple formats.
The most popular sources of alternative data include:
Data vendors: Firms that provide satellite imagery, web traffic statistics, and public social media data
Crowdsourcing: Platforms like CrowdFlower, Prolific Academic, and Amazon Mechanical Turk that collect data on consumer behavior, sentiment, and more
Web scraping: The process of automatically extracting publicly available data from websites on a large scale and, often, in real time. Such data can include product and service prices, public customer reviews and comments, stock values, and any other public information available on the internet.
It is important to note that alternative data can produce rather weak signals compared with traditional sources. Most alternative data captures short time windows and, as such, is useful mainly for deriving highly specific insights within a time frame of five years or less. On the other hand, such insights might be the key to outperforming competitors or advancing scientific research.
How Web Scraping Works
Web scraping works by using scripts and/or software tools to automate web intelligence collection. The process includes the following steps:
1. Identifying the target website and choosing the data to be extracted, including links, text, images, and other types of information
2. Analyzing the website’s code to identify the HTML elements containing useful data
3. Writing a script that automatically extracts the useful data from the HTML document, using a programming language such as Python and a web scraping library like Scrapy or BeautifulSoup, and usually outputs a CSV or JSON file
4. Running the scraper to automatically visit the target website, extract and parse the data, and store it in a structured format that analysts can read
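The parsing and storage steps above can be sketched in a few lines of Python. To keep the example runnable without extra installs, it uses the standard library's html.parser in place of Scrapy or BeautifulSoup, and it parses an inline HTML fragment instead of downloading a page. The HTML, the class names, and the function names are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this from the target site.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Notebook</span><span class="price">4.50</span></li>
  <li class="product"><span class="name">Backpack</span><span class="price">29.99</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.rows.append({})  # start a new record for each product
        elif tag == "span" and css_class in ("name", "price") and self.rows:
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_products(html):
    """Return one dict per product found in the HTML."""
    parser = ProductParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Serialize the parsed rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling to_csv(parse_products(SAMPLE_HTML)) yields a header line followed by one line per product. In a real scraper, the first two steps (choosing the target and inspecting its HTML) determine which tags and class names the parser should look for.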
While the above steps may seem simple, web scraping is technically challenging and comes with multiple obstacles. In particular, collecting web intelligence at scale typically requires residential or datacenter proxies to distribute requests, provide anonymity, and reduce the risk of server blocks.
Fortunately, the latest web scraping solutions usually take care of challenging technical issues, be it automated proxy management, CAPTCHA solving, or data extraction and parsing.
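To give a sense of what distributing requests across proxies means, here is a minimal round-robin sketch. It only pairs URLs with proxies and sends no requests. The proxy addresses and the function name are hypothetical, and commercial tools use more elaborate rotation strategies than this.

```python
from itertools import cycle

# Hypothetical proxy endpoints; a real pool would come from a proxy provider.
PROXIES = [
    "http://proxy-1.example.com:8080",
    "http://proxy-2.example.com:8080",
    "http://proxy-3.example.com:8080",
]


def assign_proxies(urls, proxies):
    """Pair each URL with the next proxy in round-robin order."""
    if not proxies:
        raise ValueError("proxy pool is empty")
    rotation = cycle(proxies)
    return [(url, next(rotation)) for url in urls]
```

Each (URL, proxy) pair could then be handed to an HTTP client that supports proxy settings, so that consecutive requests reach the target site from different addresses.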
Final Thoughts
Alternative data is a Holy Grail for organizations, businesses, and individual researchers that look for unique insights or simply need to get ahead of the competition. As a result, the global demand for alternative data is expected to increase at a compound annual growth rate (CAGR) of 52.1% from 2023 to 2030.
However, making use of alternative data is almost impossible without an awareness of what web scraping makes possible. Web intelligence is the backbone of the digital revolution – to make successful data-driven decisions, access to large amounts of public information and the quality of extracted data are key. Modern solutions, such as scraping APIs and web data unblocking tools, enable organizations to focus more on the data analysis itself instead of the technical complexities of the data collection process.