02.09.2020

From Data Mining to Knowledge Mining – ML-powered search systems

How can Machine Learning search through millions of unstructured technical drawings in one second? And how do you match observations to the correct regulation when there are more than a thousand of them and they change multiple times a year?

By IDA

From Data Mining to Knowledge Mining – ML-powered search systems and Intelligence

Crayon is a globally recognized, award-winning AI & Machine Learning consultancy with one of the largest competence pools of Data Scientists and AI Experts in the Nordics. This webinar presents three real-life, applied Machine Learning use cases. The themes are similarity search powered by Machine Learning (autoencoders and transformers, applied to business) and how Machine Learning can be used as a process accelerator through untraditional search services.

Case 1: Searching in 15 million Technical drawings

If you are asked to check the details of a technical document, how do you know whether it is visually similar to a document you or a colleague have inspected in the past? Crayon has built a system that looks through some 15 million technical documents in less than a second on a single on-prem VM, and returns the most visually similar ones. This was done using technologies such as autoencoders, random projection trees and SimCLR – we’ll explain how.
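To make the idea of random projection trees concrete, here is a minimal sketch of approximate nearest-neighbour search over embedding vectors. It is not Crayon's implementation: the embeddings are random stand-ins for what an autoencoder might produce, and the names `build_tree`, `candidates`, `search`, the leaf size and the number of trees are all assumptions chosen for illustration.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_tree(vectors, ids, rng, leaf_size=8):
    """Recursively split ids at the median of a random projection."""
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    dim = len(vectors[ids[0]])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    projections = sorted(dot(vectors[i], direction) for i in ids)
    threshold = projections[len(ids) // 2]
    left = [i for i in ids if dot(vectors[i], direction) < threshold]
    right = [i for i in ids if dot(vectors[i], direction) >= threshold]
    if not left or not right:  # degenerate split, e.g. duplicate vectors
        return {"leaf": ids}
    return {
        "direction": direction,
        "threshold": threshold,
        "left": build_tree(vectors, left, rng, leaf_size),
        "right": build_tree(vectors, right, rng, leaf_size),
    }

def candidates(tree, query):
    """Follow the query down one tree and return the ids in its leaf."""
    node = tree
    while "leaf" not in node:
        go_left = dot(query, node["direction"]) < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]

def search(forest, vectors, query, k=3):
    """Pool leaf candidates from every tree, then rank them exactly."""
    pool = set()
    for tree in forest:
        pool.update(candidates(tree, query))
    return sorted(pool, key=lambda i: sq_dist(vectors[i], query))[:k]

# Random vectors stand in for document embeddings (assumption).
rng = random.Random(0)
embeddings = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(500)]
all_ids = list(range(len(embeddings)))
forest = [build_tree(embeddings, all_ids, rng) for _ in range(10)]

query = embeddings[42]
result = search(forest, embeddings, query)
```

Each query only compares against the handful of vectors that share a leaf with it in some tree, which is why this kind of index can stay fast as the collection grows. Using several trees reduces the chance that a true neighbour is missed because it fell on the other side of one split.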

Case 2: Matching observations to regulations

Filing a collection of quality assessment observations can be tedious when it requires you to match every observation to the correct regulation, especially when there are more than a thousand of them and they change multiple times every year. We’ll talk about how we are building a smart search system that uses language models to find the most relevant regulations for an observation, even when the regulations themselves are moving targets.
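As a rough illustration of searching against regulations that keep changing, the sketch below keeps one vector per regulation ID and overwrites it whenever the text is revised. It swaps in plain token counts where the real system would use a language model, and the class `RegulationIndex`, the IDs and the regulation texts are all invented for this example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in for a language-model embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return num / (norm_a * norm_b) if norm_a and norm_b else 0.0

class RegulationIndex:
    """Hypothetical index keyed by regulation ID, so revisions overwrite."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, reg_id, text):
        # Re-embed on every revision; the old vector is simply replaced.
        self._vectors[reg_id] = embed(text)

    def remove(self, reg_id):
        self._vectors.pop(reg_id, None)

    def search(self, observation, k=3):
        q = embed(observation)
        ranked = sorted(
            self._vectors,
            key=lambda reg_id: cosine(q, self._vectors[reg_id]),
            reverse=True,
        )
        return ranked[:k]

# Invented regulation texts, for illustration only.
index = RegulationIndex()
index.upsert("REG-1", "Fire extinguishers must be inspected every twelve months")
index.upsert("REG-2", "Emergency exits must be kept free of obstructions")
index.upsert("REG-3", "Electrical panels must be labelled and accessible")

observation = "Emergency exit blocked by pallets"
before = index.search(observation, k=1)

# A regulation is revised; the same observation now maps elsewhere.
index.upsert("REG-3", "Pallets must not be stored in front of an emergency exit")
after = index.search(observation, k=1)
```

Keying the index by regulation ID means a revised regulation needs only one re-embedding, and the next search sees the new text without rebuilding anything else.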

Case 3: Matching musical works by title and composers

Remunerating composers for the use of musical works requires reports of music use to be matched to registered works and rights owners. A smart solution retrieves the correct work even when the reporting is inexact and when a database of millions of entries contains multiple similar versions of the same piece. We'll talk about how a solution based on word embeddings enables automatic matching of musical works, and how fine-tuning the results can improve the reliability of the solution.
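The matching problem can be sketched with a much simpler stand-in for word embeddings: character trigram overlap, which tolerates typos and abbreviations in reported titles and composer names. The tiny catalogue, the function names and the `title_weight` parameter below are assumptions for illustration, not the solution presented in the webinar.

```python
def trigrams(text):
    """Character trigrams of the lowercased, space-padded text."""
    padded = f"  {text.lower().strip()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Jaccard overlap between the trigram sets of two strings."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

# Toy catalogue with near-duplicate versions (illustrative only).
WORKS = [
    {"id": "W1", "title": "Morning Mood", "composers": "Edvard Grieg"},
    {"id": "W2", "title": "Morning Mood (piano arrangement)",
     "composers": "Edvard Grieg"},
    {"id": "W3", "title": "Morning Has Broken", "composers": "Traditional"},
]

def match(title, composers, works, title_weight=0.6):
    """Return the id and combined score of the best-matching work."""
    def combined(work):
        return (title_weight * similarity(title, work["title"])
                + (1 - title_weight) * similarity(composers, work["composers"]))
    best = max(works, key=combined)
    return best["id"], combined(best)

# A misspelled title and an abbreviated composer name, as in a sloppy report.
work_id, score = match("Mornig mood", "E. Grieg", WORKS)
```

Combining title and composer scores is what separates near-identical versions of a piece. The weight, and any minimum score below which a match is sent for manual review, are the kind of knobs that would be tuned on real reports.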

Presenters:

Øyvind Storesund Hetland, PhD, is a Data Scientist at Crayon's AI Center of Excellence.
Kristin Hatlen Huseby is a Data Scientist with a Master's degree in physics, working at Crayon's AI Center of Excellence.