Title: Behind the Scenes of a Market and Competitive Intelligence Platform
Key Challenges
Table of Contents
- Introduction
- Challenges faced while building a Market and
Competitive Intelligence platform
- Sourcing of information
- Removing irrelevant information
- Removing duplicate or similar information
- Identifying companies and persons
- Confusions about companies and the mentions
- Specifying Industry and topics of the article
- Perspective of the social media
- Conclusions and takeaways
Introduction
- Overview: Building a Market and Competitive
Intelligence Platform
Introduction
It took us a long time to build a market and
competitive intelligence platform
- The platform is designed to continuously monitor
thousands of websites for new information on
competitors, customers, industries and other
signals such as sales opportunities.
- That description fits in a single line, but it is the
result of constant monitoring, testing and
implementation carried out over a decade.
- Today, everyone knows the importance of such a
competitive intelligence platform, and some
CTOs are even confident that one can be built
in a month with five engineers.
- The points ahead in this deck will show that
while it might be easy to start this project,
it is painfully difficult to finish it.
Key challenges faced while building an MI platform
- Sourcing of information
- Removing irrelevant information
- Removing duplicate information
- Identifying companies and persons
- Confusions about the mentions
- Specifying Industry and topics
- Perspective of the social media
Taking up the task of building a market and
competitive intelligence platform comes with
unique challenges
Sourcing of information
Integrating thousands of websites whose new
information is continuously monitored.
Removing irrelevant information
Removing information that is not relevant to
one's business, which is most web data.
Removing duplicate information
Comparing the new information with everything
else in our database.
Identifying companies and persons
Building the capability for technology to
identify relevant companies and people.
Confusions about companies and the mentions
Managing the complexity of the "aboutness"
problem in the information collected.
Specifying Industry and topics of the article
Analyzing the aggregated information by
industry and by topic.
Perspective of the social media
Integrating different platforms and assessing the
accuracy of the information there.
1. Sourcing of information
- Most websites publish information for humans
to read, not for software to integrate.
- Interpreting information correctly from a website
- Integration with unique websites
- No universal standards for website development
- An analyst has to spend time analyzing each insight.
- Scraping modern web pages is not easy because
they are responsive, dynamic and personalized.
They use cookies, JavaScript and AJAX calls to
generate a unique web page for each user.
- Page names are dynamic: they can change without
warning and upend the whole integration.
2. Removing irrelevancy
- Defining principles for data relevance is
difficult because of the dynamic and unique nature
of information on the web.
- Contify fine-tunes this with learnings from the
data, which comes at a very high technical and
operational cost.
- Removing non-business information right at
the source, such as crime, politics,
entertainment and sports.
- E.g. we can remove stories with the word
"kill" in the title on the assumption that
they are crime related, but we cannot ignore
stories like "Google aims to kill passwords".
- Removing information that is related to business
but not relevant to our business.
- E.g. information about our industry but from a
different geography, or information about our
competitor but for a different segment where we
don't compete.
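The "kill" example can be sketched in a few lines. This is only an illustration and not Contify's actual filter: the word lists `CRIME_TERMS` and `BUSINESS_TERMS` and the function names are hypothetical, and a real system would learn such signals from data rather than hard-code them.

```python
import re

# Hypothetical word lists, for illustration only.
CRIME_TERMS = {"kill", "kills", "killed", "murder", "robbery"}
BUSINESS_TERMS = {"google", "passwords", "startup", "acquires", "launches"}


def tokenize(title: str) -> set:
    """Lowercase the title and return its distinct words."""
    return set(re.findall(r"[a-z]+", title.lower()))


def is_probably_non_business(title: str) -> bool:
    """Flag a title only if it has a crime term and no business signal."""
    words = tokenize(title)
    has_crime_term = bool(words & CRIME_TERMS)
    has_business_signal = bool(words & BUSINESS_TERMS)
    return has_crime_term and not has_business_signal


for title in ["Man killed in downtown robbery",
              "Google aims to kill passwords"]:
    print(title, "->", "drop" if is_probably_non_business(title) else "keep")
```

A single keyword is never enough on its own; the second check is what keeps the Google story in the feed.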
3. Removing duplicates
- Comparing the new information with our database.
But websites do not duplicate content in a way that
triggers copyright claims or Google's algorithms;
they alter it enough to appear unique for search
optimization.
- Standard machine learning programs group similar
articles using efficient clustering algorithms
with reasonable accuracy. The next challenge is
that, being machines, they sometimes incorrectly
group different articles or fail to group similar
ones.
- Google spent a great deal of time defining such
algorithms. We struggled to figure out the cracks
in such techniques.
- Identifying the original article that is being
duplicated, and not the other way round. We
continued on our journey of "Now what?"
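To make the near-duplicate problem concrete, here is a minimal sketch that uses word shingles and Jaccard similarity. This is a simple, well-known technique swapped in for illustration, not the clustering approach described on the slide. The function names and the 0.5 threshold are assumptions.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences (shingles) in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(new_text: str, known_texts: list, threshold: float = 0.5) -> bool:
    """True if the new text overlaps strongly with any known text."""
    new_shingles = shingles(new_text)
    return any(jaccard(new_shingles, shingles(t)) >= threshold
               for t in known_texts)


original = ("Acme Corp announced a new partnership with Globex "
            "to expand cloud services in Europe")
rewritten = ("Acme Corp announced a new partnership with Globex "
             "to expand cloud services across Europe")
print(is_near_duplicate(rewritten, [original]))
```

Comparing against every stored article this way does not scale, and the threshold trades false merges against missed duplicates, which is exactly the accuracy problem noted above.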
4. Identifying companies
- In text analytics this is called Named Entity
Recognition.
- Looking for words that have the first letter in
uppercase, like ICICI. We can achieve this with
some elementary text processing. Now, if the
following word also starts with a capital letter,
then it is part of the same name, e.g. ICICI
Bank. This could be true for the third word also:
ICICI Bank Ltd. So there are different patterns
for different identifications.
- Company names that are common words, such as
Apple, Amazon and Gap, are difficult for the
algorithms to recognize as company names. For
this, we need to look for other signals in the
article.
- Common English words cause a lot of confusion in
ordinary articles.
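The capital-letter heuristic described above can be sketched as follows. This is elementary text processing for illustration, not a full Named Entity Recognition system. The word lists `COMMON_WORDS` and `AMBIGUOUS_NAMES` and both function names are hypothetical.

```python
import re

# One or more consecutive words that each start with an uppercase letter.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][A-Za-z]*(?:\s+[A-Z][A-Za-z]*)*")

# Hypothetical lists, for illustration only.
COMMON_WORDS = {"The", "A", "An", "In", "On", "It", "This"}
AMBIGUOUS_NAMES = {"apple", "amazon", "gap"}


def candidate_names(text: str) -> list:
    """Return capitalized runs, minus single common sentence-starters."""
    runs = CAPITALIZED_RUN.findall(text)
    return [run for run in runs if run not in COMMON_WORDS]


def needs_more_signals(name: str) -> bool:
    """True for names that are also ordinary English words."""
    return name.lower() in AMBIGUOUS_NAMES


text = ("ICICI Bank Ltd reported higher profits. "
        "The results beat estimates from Apple.")
for name in candidate_names(text):
    print(name, "(ambiguous)" if needs_more_signals(name) else "")
```

The pattern finds "ICICI Bank Ltd" as one name, but it cannot tell Apple the company from apple the fruit at the start of a sentence; that is where the other signals in the article come in.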
5. Specifying Industry
- Industries are not set up in clear-cut
divisions.
- A Market Intelligence platform needs to analyze
the aggregated information by industries and by
topics like partnerships, business expansion and
new offerings.
- There are no set rules for fine-tuning the
classification algorithms to recognize words
commonly used to describe an industry.
- Reaching accuracy is very difficult, but a
competitive intelligence platform that wants to
stay reliable does not get many shots at just
trying things out.
- Example: a story reports which company acquired
which other company and which bank's investment
is involved; it can easily be misclassified as a
banking acquisition.
6. Company mentions
- How do we know whether the story is about the
company or just mentions the company? This is the
problem of the "aboutness" of the information.
- Example: a story that says "Amazon, Microsoft,
Google, and Oracle are also offering cloud
computing solutions." Clearly, it mentions
Microsoft. We don't want our intelligence users
to get this in their updates for Microsoft.
- To address it, we gave relevance scores to all the
companies in each article.
- The score depends on a lot of factors and on a
knowledge base. For example, for the products and
services signal, we need a knowledge base of all
the products and services of the company.
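A relevance score of this kind might look like the sketch below. The signals, weights and the tiny `PRODUCTS` knowledge base are assumptions chosen for illustration; the slide does not say which factors or weights the platform actually uses.

```python
import re

# Hypothetical knowledge base of products per company (illustration only).
PRODUCTS = {
    "Microsoft": ["Azure", "Windows", "Office"],
    "Acme": ["RocketSkates"],
}


def count(term: str, text: str) -> int:
    """Count whole-word, case-insensitive occurrences of term in text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def relevance_score(company: str, title: str, body: str) -> float:
    """Score from 0.0 to 1.0 built from three assumed signals."""
    score = 0.0
    if count(company, title) > 0:
        score += 0.5                                  # named in the title
    score += min(count(company, body), 3) * 0.1       # repeated in the body
    product_hits = sum(count(p, title + " " + body)
                       for p in PRODUCTS.get(company, []))
    score += min(product_hits, 2) * 0.1               # products mentioned
    return round(score, 2)


title = "Acme launches cloud platform"
body = ("Acme said its new platform is live. Amazon, Microsoft, Google, "
        "and Oracle are also offering cloud computing solutions.")
print("Acme:", relevance_score("Acme", title, body))
print("Microsoft:", relevance_score("Microsoft", title, body))
```

With a cut-off on the score, the passing mention of Microsoft stays out of the Microsoft feed while the story still reaches users who follow Acme.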
7. Social media
- Social media is a web of information with very
little quality information, and that has to be
extracted.
- Extracting a few relevant pieces of information
from tons of mindless shares, tweets, and
retweets is like finding a needle in a haystack
without a magnet.
- Our intelligence engine rejects more than 95% of
social updates from companies.
- Complexity on social media keeps increasing with
new marketing hacks. Companies have different
accounts not only for different regions but for
different departments too.
- It is not easy to reach the right place, the right
article and the authentic profile of a company in
the junk of data on social media platforms.
Conclusions and takeaways
- Data on the web is a goldmine, but extracting the
gold from the trash is a task that not everyone is
capable of.
- Example: the business strategy section of Apple's
annual report had just two additional words in
2002 that were not there in 2001. These were
"cellular phones". Yet many were surprised when
Apple, a computer company, launched the iPhone
five years later.
- Competitive and Market Intelligence is not within
easy reach of just any team of developers; it has
to be optimized with the organization's
efficiencies in mind and with the need to support
better internal decision making.
Key takeaways
Put this kind of effort into building a market
intelligence platform only if that is the core of
your business. If not, then building one would
not be wise even if you have a great technology
team.
Thank you
Choose the industry-best Competitive Intelligence
and Market Research system
Contify
marketing_at_contify.com
https://bit.ly/34zsYy7