While numeric data is most closely associated with the MATLAB environment, MathWorks is aggressively expanding its reach, adding sophisticated text analytics capabilities to its recent Release 2017b offering.
The new Text Analytics Toolbox, a product within the semi-annual MATLAB upgrade, is designed to help engineers extract more value from data, specifically unstructured text, which the company claims constitutes roughly 80% or more of all data produced. In industries like automotive and aerospace in particular, service and maintenance logs are a critical source of text-based material which, when combined with machine sensor data and machine learning capabilities, can enable predictive maintenance applications that give manufacturers a competitive edge.
“This is a new direction in terms of the type of data we’re working with,” says Seth DeLand, product marketing manager for data analytics at MathWorks. “We’re focused on making it so engineers and customers can get more value out of the data they have that’s raw text.”
That data can be found in Word documents, PDF files, social media posts, field maintenance logs and reports, and even in digitized equipment logs. Service and maintenance logs, in particular, hold valuable information that could inform future design decisions or aid in preventive and predictive maintenance. Historically, however, this text has been difficult to collect in a single place, let alone parse. If mined effectively, this pool of text data could reveal the common reasons a car comes in for maintenance, for example, or build a better understanding of why a particular part failed during real-world operation, DeLand explains. In the same vein, digitized equipment logs are another invaluable source of maintenance information, as they house data and error message logs that could provide insight into operational problems, he adds.
With the features of Text Analytics Toolbox, engineers can more easily harness that text data to create predictive models. The toolbox imports raw data from all of these sources and brings it into MATLAB, which includes a variety of pre-processing tools and functions to clean up the data for further analysis. The toolbox then converts the cleaned-up text to numeric data, which can be used to build predictive models in MATLAB using its apps and other well-known techniques.
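To make the idea of converting text to numeric data concrete, here is a minimal sketch of one common technique, a bag-of-words count matrix. It is written in plain Python rather than MATLAB and is not the Text Analytics Toolbox API; the log entries, the stop-word list and the function names are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical maintenance-log entries, invented for illustration.
LOGS = [
    "Pump P-101 leaking oil; replaced seal.",
    "Replaced worn belt on conveyor. Belt was frayed.",
    "Oil leak at pump seal, tightened fitting.",
]

# A tiny, illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "at", "on", "the", "was", "and"}


def tokenize(text):
    """Lowercase the text, keep alphanumeric runs, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(documents):
    """Return (vocabulary, count matrix) with one row per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        # Counter returns 0 for words that do not occur in this document.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


vocabulary, matrix = bag_of_words(LOGS)
```

Each row of `matrix` is a numeric vector for one log entry, with one column per word in `vocabulary`. A table in this form can be handed to a standard classifier or regression model, which is the general step the article describes.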
“Engineers can look at maintenance logs and use text analytics capabilities to contextualize the raw sensor data,” DeLand explains. “Then they can design algorithms to predict future failures.”
Unlike domain-specific analytics applications, MATLAB’s text analytics are general in function, so they don’t get tripped up by acronyms with different meanings (e.g., PM meaning predictive maintenance, not afternoon). The MATLAB high-level language and Integrated Development Environment also make them more accessible to non-experts familiar with MATLAB syntax, DeLand says. In addition, the MATLAB environment makes it easy to integrate other sources of data, such as equipment IDs, sensor readings or weather data, with the text information.
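As a rough illustration of combining text with other data sources, the sketch below joins keyword counts from log text with sensor readings that share an equipment ID. It is plain Python, not MATLAB code, and every record, field name and keyword in it is a hypothetical stand-in.

```python
# Hypothetical records, invented for illustration.
LOG_ENTRIES = [
    {"equipment_id": "P-101", "text": "oil leak at pump seal"},
    {"equipment_id": "C-7", "text": "belt frayed, replaced"},
    {"equipment_id": "X-9", "text": "no sensor installed"},
]
SENSOR_READINGS = {
    "P-101": {"vibration_mm_s": 7.2, "temp_c": 81.0},
    "C-7": {"vibration_mm_s": 2.1, "temp_c": 45.5},
}
KEYWORDS = ["leak", "belt"]


def build_feature_rows(log_entries, sensor_readings, keywords):
    """Join each log entry with its sensor readings by equipment ID."""
    rows = []
    for entry in log_entries:
        sensors = sensor_readings.get(entry["equipment_id"])
        if sensors is None:
            continue  # skip logs that have no matching sensor data
        words = entry["text"].lower().replace(",", " ").split()
        text_features = [words.count(k) for k in keywords]
        numeric_features = [sensors["vibration_mm_s"], sensors["temp_c"]]
        rows.append({
            "equipment_id": entry["equipment_id"],
            "features": text_features + numeric_features,
        })
    return rows


rows = build_feature_rows(LOG_ENTRIES, SENSOR_READINGS, KEYWORDS)
```

Each resulting row holds the keyword counts followed by the sensor values, so text-derived and sensor-derived features sit side by side for a single piece of equipment.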
Watch this video for an overview of the highlights of MATLAB R2017b, including the new text analytics capabilities.