By Dr. Andrew Duchon
Director of Data Science at Manzama, Inc.
Business development and competitive intelligence both require understanding what is happening with client companies, prospective clients, and even entire industries. One attorney (or any B2B salesperson) might have 10 clients and 90 prospective clients. Each of those might have, on average, 200 articles a month appearing in the news, noting events taking place with respect to the company, e.g., deals they are making with other companies, executives coming or going, sales or other financial reports, and interactions with regulators. That’s 20,000 articles per month an attorney might need to read to be fully up to speed, and that does not even account for general news and industry trends.
Instead of taking this bottom-up approach to understanding the big picture, Manzama’s Insights™ uses machine learning to process and organize all that data so that users can quickly get the big picture first, then dive into particular aspects relevant to them. The first question is: what are the colors of that big picture? Insights classifies news concerning a company into 25 subfactors, which are grouped into six factors, as shown in the first six columns of the table below. The seventh column shows the Ignored factor, which contains six kinds of company mentions that may be relevant to understanding a company generally, like conferences and marketing, but are not relevant to its corporate health, which is discussed below.
Insights Factors and Subfactors
| Financials | Government | Partners & Competitors | Operations | Products & Services | Management | Ignored |
|---|---|---|---|---|---|---|
| Analyst | Politics | Competition | Attacks & Disasters | Intellectual Property | Executive Movement | Conference |
| Bankruptcy | Regulation | Markets | Cyber Issues | | Executives | Crime |
| Financials | Taxes | Mergers & Acquisitions | Expansion & Contraction | Product Liability | Insider Transactions | Marketing |
| Stock News | | | Employees | Public Sentiment | Misconduct | Non-English |
| | | | Supply Chain | Sales | Shareholders | Non-Target |
| | | | | | | Spam |
Empirically, we have found that all relevant news about a company can be classified into one of these 25 subfactors and we have 30 pages of guidelines to help us make these determinations.
Of course, we ourselves are not personally classifying the hundreds of thousands of articles coming into Manzama every day. That’s where machine learning comes in. We have trained a deep neural network to turn the words in a headline about a company into numbers, which the network processes to determine which of the 25 subfactors is being discussed.
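To make "turning the words in a headline into numbers" concrete, here is a minimal sketch of one common first step: mapping each word to an integer id. This is only an illustration. The function names `build_vocab` and `encode` are hypothetical, and the actual preprocessing and network used in Insights are not described in this article.

```python
import re

def tokenize(headline):
    """Lowercase a headline and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", headline.lower())

def build_vocab(headlines):
    """Assign each distinct token an integer id; 0 is reserved for unknown words."""
    vocab = {}
    for headline in headlines:
        for token in tokenize(headline):
            vocab.setdefault(token, len(vocab) + 1)
    return vocab

def encode(headline, vocab):
    """Turn a headline into a list of integer ids (hypothetical model input)."""
    return [vocab.get(token, 0) for token in tokenize(headline)]

vocab = build_vocab(["KPMG overtakes PwC in audit market"])
ids = encode("PwC overtakes KPMG", vocab)
```

A classifier would then consume sequences like `ids` rather than raw text; words never seen during training fall back to the reserved id 0.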
In addition, the network determines the “valence” of the news for the company: positive, negative, or neutral. In general, negative articles indicate “bad” news about the company or are indicative of the company shrinking, whereas positive articles indicate “good” news about the company or are indicative of growth. In other cases, valence reflects whether the relationship between the company and some other entity is good or bad. Neutral articles are generally of a few different types: the goodness or badness of the news is unclear; the news has no real impact on or indication of corporate health but should not be excluded; the news is both positive and negative, e.g., “this was good, but that was bad”; the company is neutralizing bad news; or the article discusses an event after it has already occurred. Of course, the news can concern more than one company, so Insights analyzes the news relative to each company mentioned. “KPMG overtakes PwC in FTSE 100 Audit Market” is obviously good news for KPMG and bad news for PriceWaterhouseCoopers in the subfactor of Competition.
On the whole, this balance of positive and negative news is used to calculate a corporate “Health Score.” We normalize this score from -10 (very unhealthy) to +10 (very healthy). Most companies hover around 0, which is considered normal. Life happens, bad things happen: but can the company overcome those and keep growing? That’s a typical company. Sometimes everything goes their way. But how likely is that to continue? On the other hand, with enough bad news, a company might go bankrupt, or have its remaining assets merged into another company and thereby disappear. Using Manzama’s historical data, we’re modeling how these possibilities play out.
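The article does not give the formula behind the Health Score, so the following is only a sketch of one simple way a balance of positive and negative articles could be mapped onto the -10 to +10 range. The function `health_score` and its formula (net positive share of all articles, scaled by 10) are assumptions for illustration, not Manzama's actual calculation.

```python
def health_score(valences):
    """Map a list of 'positive'/'negative'/'neutral' labels to a score in [-10, 10].

    Assumed formula (not from the article): 10 * (positives - negatives) / total.
    Neutral articles count toward the total and so pull the score toward 0.
    """
    total = len(valences)
    if total == 0:
        return 0.0  # no news: treat as the "normal" midpoint
    positives = sum(1 for v in valences if v == "positive")
    negatives = sum(1 for v in valences if v == "negative")
    return 10.0 * (positives - negatives) / total

# 3 positive, 1 negative, 4 neutral articles -> 10 * (3 - 1) / 8
mixed = health_score(["positive"] * 3 + ["negative"] + ["neutral"] * 4)
```

Under this assumed formula, a company with only negative news scores -10, one with only positive news scores +10, and an even mix lands near 0.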
Besides category and valence, each relevant article is also labeled True or False for having an element of Litigation, Rumor, or Opinion. Each of these labels, which we call Aspects, is assigned independently. By Litigation, we mean that courts or lawyers are involved. Rumor refers to actual rumors, leaks, forecasts, or other events that have not definitively occurred. With Opinion, we hope to identify articles that represent a personal opinion (vs. a fact), including speculation, commentary, or otherwise unverifiable information.
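Putting the labels together, each article ends up with one record per company it mentions: a subfactor, a valence, and three independent True/False aspects. The sketch below shows one possible shape for such a record. The class name `ArticleLabel` and its field names are hypothetical, chosen only to mirror the terms used in this article.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArticleLabel:
    """One label record per (article, company) pair; field names are illustrative."""
    company: str
    headline: str
    subfactor: str            # one of the 25 subfactors, e.g. "Competition"
    valence: str              # "positive", "negative", or "neutral"
    litigation: bool = False  # courts or lawyers are involved
    rumor: bool = False       # event has not definitively occurred
    opinion: bool = False     # personal opinion rather than fact

headline = "KPMG overtakes PwC in FTSE 100 Audit Market"
labels = [
    # The same headline is labeled separately for each company it mentions.
    ArticleLabel("KPMG", headline, "Competition", "positive"),
    ArticleLabel("PricewaterhouseCoopers", headline, "Competition", "negative"),
]
```

Keeping the aspects as separate booleans reflects that they are labeled independently: an article can be both a rumor and about litigation, or neither.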
After all the news about a company has been classified as to its subfactor, valence, and aspects, the next step is to group the articles together for the user. Among the 20,000 hypothetical articles mentioned above, there is likely a lot of redundancy: there are not 20,000 different events taking place, just re-syndicated or re-worded articles about the same, say, 2,000 events. Insights helps here as well. On a given day, for a given company, within a subfactor, there is likely to be just one or two events occurring. So Insights clusters those articles into “stories,” each discussing just one event from one aspect. Over time, across days, stories are grouped into “storylines.”
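The grouping step can be sketched as follows: bucket articles by company, day, and subfactor, then merge articles within a bucket whose headlines share most of their words. This is a deliberately simple stand-in. The word-overlap (Jaccard) measure, the 0.5 threshold, and the function `cluster_stories` are assumptions for illustration; the clustering Insights actually uses is not described here, and linking stories into storylines across days is not shown.

```python
from collections import defaultdict

def tokens(headline):
    """Lowercased set of whitespace-separated words in a headline."""
    return set(headline.lower().split())

def jaccard(a, b):
    """Share of words two headlines have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_stories(articles, threshold=0.5):
    """Group articles into stories within each (company, day, subfactor) bucket."""
    buckets = defaultdict(list)
    for article in articles:
        key = (article["company"], article["day"], article["subfactor"])
        buckets[key].append(article)

    stories = []
    for group in buckets.values():
        clusters = []  # each cluster is a list of articles about one event
        for article in group:
            for cluster in clusters:
                # Compare against the first article of the cluster.
                first = cluster[0]["headline"]
                if jaccard(tokens(article["headline"]), tokens(first)) >= threshold:
                    cluster.append(article)
                    break
            else:
                clusters.append([article])
        stories.extend(clusters)
    return stories

def make(headline):
    # "Acme" and these headlines are made-up sample data.
    return {"company": "Acme", "day": "2018-12-10",
            "subfactor": "Product Liability", "headline": headline}

stories = cluster_stories([
    make("Acme recalls faulty widgets"),
    make("Acme recalls faulty widgets nationwide"),
    make("Acme sued over warranty terms"),
])
```

Here the two recall headlines collapse into one story and the lawsuit headline forms a second, so three articles become two stories.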
By starting at the storyline level, users can get the big picture and then very quickly see how events have played out over time. To take a recent example, in early December 2018, Google decided to shut down its Google+ social network earlier than anticipated. The AI in Insights connects that news to stories from two months earlier to give the user a broader context for this news.
Using Insights, a lawyer working in cybersecurity could be alerted to just those types of events across hundreds of companies, confident that the alerts they receive will still be quite sparse and will quickly point them to the context they need to reach out to that client or potential client.
If you want to learn more about how Insights can help you as a data professional, contact us.