Assessing Corporate Health with Machine Learning

Dr. Andrew Duchon

Director of Data Science at Manzama, Inc.

Business development and competitive intelligence both require understanding what is happening with client companies, prospective clients, and even entire industries. One attorney (or any B2B salesperson) might have 10 clients and 90 prospective clients. Each of those might have, on average, 200 articles a month appearing in the news, noting events taking place with respect to the company, e.g., deals they are making with other companies, executives coming or going, sales or other financial reports, and interactions with regulators. That’s 20,000 articles per month an attorney might need to read to be fully up to speed, and that does not even account for general news and industry trends.

Instead of taking this bottom-up approach to understanding the big picture, Manzama’s Insights™ uses machine learning to process and organize all that data so that users can quickly get the big picture first, then dive into the particular aspects relevant to them. The first question is: what are the colors of that big picture? Insights classifies news concerning a company into 25 subfactors, which are grouped into six factors, as shown in the first six columns of the table below. The seventh column shows the Ignored factor, which contains six kinds of company mentions that may be relevant to understanding a company generally, like conferences and marketing, but are not relevant to its corporate health, which is discussed below.

Insights Factors and Subfactors

| Financials | Government | Partners & Competitors | Operations | Products & Services | Management | Ignored |
| --- | --- | --- | --- | --- | --- | --- |
| Analyst | Politics | Competition | Attacks & Disasters | Intellectual Property | Executive Movement | Conference |
| Bankruptcy | Regulation | Deals | Cyber Issues | Product | Executives | Crime |
| Financials | Taxes | Mergers & Acquisitions | Expansion & Contraction | Product Liability | Insider Transactions | Marketing |
| Stock News | | | Employees | Public Sentiment | Misconduct | Non-English |
| | | | Supply Chain | Sales | Shareholders | Non-Target |
| | | | | | | Spam |

Empirically, we have found that all relevant news about a company can be classified into one of these 25 subfactors and we have 30 pages of guidelines to help us make these determinations.
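The taxonomy in the table can be written down as a simple lookup structure. The following is a minimal Python sketch for illustration, not Manzama's implementation; the names `FACTORS`, `IGNORED`, and `factor_of` are hypothetical and do not come from Insights itself.

```python
# Hypothetical encoding of the Insights taxonomy shown in the table.
# Six factors hold the 25 relevant subfactors; Ignored holds six more.
FACTORS = {
    "Financials": ["Analyst", "Bankruptcy", "Financials", "Stock News"],
    "Government": ["Politics", "Regulation", "Taxes"],
    "Partners & Competitors": ["Competition", "Deals", "Mergers & Acquisitions"],
    "Operations": [
        "Attacks & Disasters", "Cyber Issues", "Expansion & Contraction",
        "Employees", "Supply Chain",
    ],
    "Products & Services": [
        "Intellectual Property", "Product", "Product Liability",
        "Public Sentiment", "Sales",
    ],
    "Management": [
        "Executive Movement", "Executives", "Insider Transactions",
        "Misconduct", "Shareholders",
    ],
}
IGNORED = ["Conference", "Crime", "Marketing", "Non-English", "Non-Target", "Spam"]


def factor_of(subfactor):
    """Return the factor a subfactor belongs to, or "Ignored"."""
    for factor, subfactors in FACTORS.items():
        if subfactor in subfactors:
            return factor
    if subfactor in IGNORED:
        return "Ignored"
    raise KeyError(subfactor)


# Count the relevant subfactors across the six factors.
relevant_count = sum(len(subfactors) for subfactors in FACTORS.values())
```

Counting the entries this way gives a quick check that the six factors together hold the 25 relevant subfactors, with six more under Ignored.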

Of course, we ourselves are not personally classifying the hundreds of thousands of articles coming into Manzama every day. That’s where machine learning comes in. We have trained a deep neural network that turns the words in a headline about a company into numbers, which the network then processes to determine which of the 25 relevant subfactors is being discussed.

In addition, the network determines the “valence” of the news for the company: positive, negative, or neutral. In general, negative articles indicate “bad” news about the company or are indicative of the company shrinking, whereas positive articles indicate “good” news about the company or are indicative of growth. In other cases, valence reflects whether the relationship between the company and some other entity is good or bad. Neutral articles are generally of a few different types: the goodness or badness of the news is unclear; the news has no real impact on or indication of corporate health but should not be excluded; the news is both positive and negative, e.g., “this was good, but that was bad”; the company is neutralizing bad news; or the article discusses an event after it has already occurred. Of course, the news can concern more than one company, so Insights analyzes the news relative to each company mentioned. “KPMG overtakes PwC in FTSE 100 Audit Market” is obviously good news for KPMG and bad news for PricewaterhouseCoopers in the subfactor of Competition.

On the whole, this balance of positive and negative news is used to calculate a corporate “Health Score.” We normalize this score from -10 (very unhealthy) to +10 (very healthy). Most companies range around 0, which is considered normal. Life happens, bad things happen: but can the company overcome those and keep growing? That’s a typical company. Sometimes everything goes their way. But how likely is that to continue? On the other hand, with enough bad news, a company might go bankrupt, or have its remaining assets merged into another company and thereby disappear. Using Manzama’s historical data, we’re modeling how these possibilities play out.
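The article does not give the formula behind the Health Score, so the following Python sketch is only one plausible reading of “balance of positive and negative news” normalized to the range -10 to +10. The formula, the function name `health_score`, and the choice to leave neutral articles out of the balance are all assumptions made here for illustration.

```python
def health_score(valences):
    """Map a list of valence labels to a score in [-10, +10].

    Assumed formula (not from the article):
        10 * (positive - negative) / (positive + negative)
    Neutral articles are left out of the balance in this sketch.
    """
    positive = sum(1 for v in valences if v == "positive")
    negative = sum(1 for v in valences if v == "negative")
    if positive + negative == 0:
        # No valenced news at all: treat the company as "normal".
        return 0.0
    return 10.0 * (positive - negative) / (positive + negative)


# Three positive articles, one negative, one neutral.
example = health_score(
    ["positive", "positive", "positive", "negative", "neutral"]
)
```

Under this assumed formula, a company with only negative news scores -10, one with only positive news scores +10, and an even mix lands at 0, matching the “normal” midpoint described above.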

Besides category and valence, each relevant article is also labeled True or False for having an element of Litigation, Rumor, or Opinion, each labeled independently. We call these Aspects. By Litigation, we mean that courts or lawyers are involved. Rumor refers to actual rumors, leaks, forecasts, or other events that have not definitively occurred. With Opinion, we hope to classify articles as representing a personal opinion (vs. a fact), including speculation, commentary, or otherwise unverifiable information.
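Taken together, the labels for one article and one company form a small record: a subfactor, a valence, and three independent True/False aspects. The Python sketch below shows one hypothetical way to represent that record; the class name `ArticleLabel` and its field names are illustrative assumptions, not part of Insights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticleLabel:
    """Hypothetical per-company label record for a single article."""
    company: str
    subfactor: str
    valence: str              # "positive", "negative", or "neutral"
    litigation: bool = False  # courts or lawyers are involved
    rumor: bool = False       # event has not definitively occurred
    opinion: bool = False     # personal opinion rather than fact


# The KPMG/PwC headline discussed earlier yields one label per company,
# with opposite valences in the same subfactor.
labels = [
    ArticleLabel(company="KPMG", subfactor="Competition", valence="positive"),
    ArticleLabel(company="PwC", subfactor="Competition", valence="negative"),
]
```

Keeping the three aspects as separate booleans reflects that they are labeled independently: an article can be both a Rumor and involve Litigation, for example.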

After all the news about a company has been classified as to its subfactor, valence, and aspects, the next step is to group the articles together for the user. Of the 20,000 hypothetical articles mentioned above, there is likely a lot of redundancy: there are not 20,000 different events taking place, just re-syndicated or re-worded articles about the same, say, 2,000 events. Insights helps here as well. On a given day, for a given company, within a subfactor, there is likely to be just one or two events occurring. So Insights clusters those articles into “stories,” each discussing just one event from one aspect. Over time, across days, stories are grouped into “storylines.”
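The first step of that grouping, bucketing articles by company, day, and subfactor, can be sketched in a few lines of Python. This is a simplification: it treats each bucket as a single story, whereas the text above says a bucket may hold one or two events, and the article does not describe how Insights separates them or links stories into storylines. The function name `bucket_articles`, the dictionary keys, and the sample headlines for the made-up company “ExampleCo” are all hypothetical.

```python
from collections import defaultdict


def bucket_articles(articles):
    """Group article dicts by (company, date, subfactor)."""
    buckets = defaultdict(list)
    for article in articles:
        key = (article["company"], article["date"], article["subfactor"])
        buckets[key].append(article["headline"])
    return dict(buckets)


# Made-up sample data: two re-worded articles about one event,
# plus one article about a different event on the same day.
articles = [
    {"company": "ExampleCo", "date": "2018-12-10",
     "subfactor": "Cyber Issues",
     "headline": "ExampleCo discloses data breach"},
    {"company": "ExampleCo", "date": "2018-12-10",
     "subfactor": "Cyber Issues",
     "headline": "Data breach disclosed by ExampleCo"},
    {"company": "ExampleCo", "date": "2018-12-10",
     "subfactor": "Deals",
     "headline": "ExampleCo signs supply deal"},
]
stories = bucket_articles(articles)
```

Here the three articles collapse into two buckets, which illustrates how redundancy shrinks the reading load even before any finer-grained clustering is applied.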

By starting at the storyline level, users can get the big picture and then very quickly see how events have played out over time. To take a recent example, in early December 2018, Google decided to shut down its Google+ social network earlier than anticipated. The AI in Insights connects that news to stories from two months earlier to give the user a broader context for this news.

Using Insights, a lawyer working in Cyber Security could be alerted to just those types of events, but across hundreds of companies, knowing that the data they receive will still be quite sparse and will quickly point them to the context they need to reach out to that client or potential client.

If you want to learn more about how Insights can help you as a data professional, please contact us.





Page last updated January 17, 2019 @ 4:18 pm; this content last updated December 17, 2018 @ 9:12 am