A provider of real estate and financial industry news and information through research, real-time data, analytics products, trade publications and events, generating annual revenue of more than US$20 million.
The company was in the process of exploring opportunities in certain emerging markets, and planning to accelerate market intelligence processes to accurately measure commercial and residential lending performance in target locations. The primary requirements were
Data source issues: Identifying official or authentic sources of information on property types, taxes, lending rates and transactions, mortgages, yields, indices, etc., and quickly compiling the required data, was a huge challenge because the data was dispersed across different types of public and private resources.
Data preparation issues: a) A lot of information was located in unstructured data sources, and b) incorporating updates or revisions, which were published using different methods and formats across locations, was tricky.
Metrics and calculation difficulties: There was a lack of harmonization in computation purposes and index compilation across countries and locations, with differences in valuation rules, industry information standards, statistical requirements, etc.
New Database Structure and Integration System
Enabled easy storage, retrieval, integration and dataset preparation from hundreds of file formats, both structured and unstructured, despite the wide diversity of data sources across target geographical locations. This allowed:
Market Visibility Dashboard
Enabled users to track, compare and analyze commercial and residential lending, with filters for target countries and cities, and with the ability to narrow down on loan types and ranges across lenders and property types, which allows
Performed research to identify different official or authentic data sources and types across target locations
Set up a semi-automated data collection and validation process, using crawlers/scrapers for certain sites and scripts/code to extract files and data, and aligning with county-specific upload schedules to make sure the data is extracted as soon as it gets refreshed. Performed manual data identification and capture procedures for highly secure public databases via proxy servers.
Designed a data warehouse for storage and integration of all data sources (internal and external), and an online PHP application for manual data updates.
Developed a data cleansing and transformation module to get the data ready for further analysis/reporting, involving thorough QC, and set up a procedure for manual data key-ins where the automated solution was not able to pick correct information from unstructured documents.
Consulted relevant financial experts to design metrics that are comparable across locations, keeping in view the diverse rules, standards and statistical requirements.
Created a dashboard with features such as root cause analysis / drill down
To measure commercial and residential lending, sales & foreclosures across various parameters such as
To generate alerts and notifications regarding user-defined market events or changes
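As an illustration of how alerts on user-defined market events might work, the following is a minimal Python sketch, not the actual implementation used in the project. It assumes metrics are stored as simple snapshots keyed by location and metric name; the names `AlertRule`, `pct_change_exceeds` and `evaluate_rules`, as well as the sample locations and figures, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A snapshot maps (location, metric name) to a numeric value (assumed layout).
Snapshot = Dict[Tuple[str, str], float]


@dataclass(frozen=True)
class AlertRule:
    """A user-defined rule: fire when `condition(previous, current)` is true."""
    name: str
    location: str
    metric: str
    condition: Callable[[float, float], bool]


def pct_change_exceeds(threshold_pct: float) -> Callable[[float, float], bool]:
    """Build a condition that is true when the absolute percentage change
    between two refreshes is at least `threshold_pct`."""
    def check(previous: float, current: float) -> bool:
        if previous == 0:
            return False  # percentage change is undefined; do not alert
        change = abs(current - previous) / abs(previous) * 100
        return change >= threshold_pct
    return check


def evaluate_rules(rules: List[AlertRule],
                   previous: Snapshot,
                   current: Snapshot) -> List[str]:
    """Return one notification message for every rule that fires."""
    messages = []
    for rule in rules:
        key = (rule.location, rule.metric)
        if key not in previous or key not in current:
            continue  # skip rules whose data is missing in either snapshot
        if rule.condition(previous[key], current[key]):
            messages.append(
                f"{rule.name}: {rule.metric} in {rule.location} "
                f"moved from {previous[key]} to {current[key]}"
            )
    return messages


# Hypothetical sample data for two data refreshes.
previous = {("City A", "foreclosures"): 100.0,
            ("City B", "residential_lending"): 200.0}
current = {("City A", "foreclosures"): 125.0,
           ("City B", "residential_lending"): 205.0}

rules = [
    AlertRule("Foreclosure spike", "City A", "foreclosures",
              pct_change_exceeds(20)),
    AlertRule("Lending shift", "City B", "residential_lending",
              pct_change_exceeds(10)),
]

alerts = evaluate_rules(rules, previous, current)
for message in alerts:
    print(message)
```

With the sample figures, only the first rule fires, because foreclosures in City A changed by 25% while residential lending in City B changed by 2.5%. Keeping each rule as data plus a small condition function is one way to let users define their own events without changing the evaluation loop.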