Our process in detail
View and download our process flow diagram below:
View and download a full PDF report from one of our past projects below:
Our methods in detail
All our methods were developed as part of multiyear research projects and have led to numerous scientific publications in top-tier international peer-reviewed journals. They have all been validated on large-scale datasets from numerous clients in the construction, mining, industrial, oil and gas, and electrical domains, made up of free text injury, near miss, and hazard observation reports.
In addition to a detailed report (see above), the outputs of all our analyses are accessible via two interactive web applications: the desktop app, for office use, and the mobile app, for onsite use (think of a pre-meeting tool for safety managers).
We can also develop APIs to give you programmatic access to your analyses and models, if you need to use your own user interfaces.
Watch the mobile app in action in the videos below:
Read full details about our methods in the following sections.
The attribute framework for construction safety
The attribute framework forms the basis for some of our most powerful methods. It allows any construction situation to be uniquely and comprehensively described by a finite number of basic attributes. Attributes pertain to construction means and methods, environmental conditions, and human factors. Some examples include welding, working overhead, wind, object on the floor, powered tool, confined workspace, and improper procedure.
Crucially, all attributes are observable before an accident occurs. This means that attributes can be used to understand the conditions under which injuries occur and to predict them. For this reason, attributes are often referred to as precursors.
Our natural language processing (NLP) tool can extract more than a hundred attributes, energy sources, and some safety outcomes (body parts, injury types, etc.) with high accuracy from unstructured injury, near miss, or observation narratives. It can process databases of hundreds of thousands of injury reports in minutes. It needs no training data and can thus be applied to small or large databases with the same accuracy. This also means that it can be updated quickly and re-run as many times as necessary, for instance, to add new, custom attributes.
Applications are numerous and include:
standardizing and indexing an injury report database to be queried by a search engine
understanding and visualizing trends and patterns in an injury report database
modeling and simulating safety risk by combining attribute counts, severity, and exposure data
using attributes as precursors to predict safety outcomes with machine learning
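To make the idea of training-free attribute extraction more concrete, here is a minimal sketch of a lexicon-based approach. It is only an illustration under stated assumptions: the pattern dictionary, the attribute subset, and the function name extract_attributes are hypothetical, and our actual tool is considerably more elaborate.

```python
import re

# Hypothetical lexicon: attribute name -> regular expressions that signal it.
# A real lexicon would be much larger and curated by domain experts.
ATTRIBUTE_PATTERNS = {
    "welding": [r"\bweld(?:ing|ed|ers?|s)?\b"],
    "working overhead": [r"\boverhead\b", r"\babove (?:his|her|their) head\b"],
    "wind": [r"\bwind(?:y|s)?\b", r"\bgusts?\b"],
    "powered tool": [r"\b(?:grinder|drill|nail gun|chainsaw)s?\b"],
    "confined workspace": [r"\bconfined (?:space|workspace|area)\b"],
}

# Compile once so that large report databases can be scanned quickly.
COMPILED = {
    name: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for name, patterns in ATTRIBUTE_PATTERNS.items()
}


def extract_attributes(narrative: str) -> list[str]:
    """Return the sorted names of all attributes with at least one matching pattern."""
    return sorted(
        name
        for name, regexes in COMPILED.items()
        if any(rx.search(narrative) for rx in regexes)
    )


report = "Worker was welding overhead in a confined space when a gust of wind hit."
print(extract_attributes(report))
```

Because such rules need no labeled examples, adding a custom attribute in this sketch amounts to adding one entry to the dictionary and re-running the extraction.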
Note that some of our coolest functionalities, like the smart search engine or automatically learning precursors from text, are not based on the attribute framework. Watch all videos below for more!
Diagnostics and visualization of construction injury report databases
Once attributes and outcomes have been extracted from your database with our NLP tool, the first step is to visualize and understand basic statistics and trends, both for the variables extracted by our NLP tool and for all your own variables. You can do that in our web app, thanks to a unified filtering interface and various tools and visualizations. This gives you a quick and very visual overview of the major safety-related issues that your organization is facing. Watch the videos below for more!
Part 1/3: filtering interface, basic stats visualization, time series injury count modeling and forecasting, and report inspection.
Part 2/3: word co-occurrence networks, keyphrase extraction, report ranking, word clouds.
Part 3/3: attribute co-occurrence networks.
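As a rough illustration of what lies behind an attribute co-occurrence network, the sketch below counts how often each pair of attributes appears in the same report. The input data and the function name cooccurrence_edges are hypothetical; the resulting pair counts can be read as the edge weights of a network whose nodes are attributes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: the set of attributes extracted from each report.
reports = [
    {"welding", "working overhead", "confined workspace"},
    {"welding", "confined workspace"},
    {"wind", "working overhead"},
    {"welding", "working overhead"},
]


def cooccurrence_edges(attribute_sets):
    """Count, for each unordered attribute pair, the reports that contain both."""
    edges = Counter()
    for attrs in attribute_sets:
        # Sorting gives every pair a canonical (a, b) order with a < b.
        edges.update(combinations(sorted(attrs), 2))
    return edges


edges = cooccurrence_edges(reports)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```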
Construction safety risk modeling and simulation
By combining attribute counts, severity levels, and exposure values (gathered through a quick online survey that we send out to relevant people at your organization), we can compute risk at the attribute and situational levels. We also use statistical methods borrowed from climatology and extreme value analysis to model and simulate safety risk, allowing for more robust interpretation of the risk scores.
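The following sketch shows one simple way such a combination could work. It rests on assumptions that are ours for illustration only: the severity weights, counts, and exposure values are made up, attribute risk is taken as the severity-weighted count divided by exposure, and the risk of a situation is taken as the sum of the risks of its attributes. The statistical modeling and simulation step is not covered here.

```python
# Hypothetical severity weights (higher = worse); a real scale would differ.
SEVERITY_WEIGHT = {"first aid": 1.0, "medical case": 4.0, "lost work time": 16.0}

# Hypothetical number of reports per attribute and severity level.
counts = {
    "welding": {"first aid": 10, "medical case": 2, "lost work time": 1},
    "wind": {"first aid": 3, "medical case": 1, "lost work time": 0},
}

# Hypothetical exposure values, e.g., how often each attribute is encountered onsite.
exposure = {"welding": 0.5, "wind": 0.25}


def attribute_risk(attribute):
    """Severity-weighted report count, normalized by exposure (assumed formula)."""
    weighted = sum(
        SEVERITY_WEIGHT[level] * n for level, n in counts[attribute].items()
    )
    return weighted / exposure[attribute]


def situation_risk(attributes):
    """Risk of a situation, assumed here to be the sum of its attribute risks."""
    return sum(attribute_risk(a) for a in attributes)


print(attribute_risk("welding"))
print(situation_risk(["welding", "wind"]))
```

Dividing by exposure matters: an attribute that appears in many reports only because it is present on almost every site should not automatically be scored as riskier than a rare but dangerous one.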
With the risk analysis functionality, you can answer questions such as: (1) how risky is a particular work package? (2) what is the distribution of risk in time and space on a particular site? (3) on which aspects of a given task should the JHA/safety meeting/toolbox talk focus?
Read our paper!
Construction Safety Risk Modeling and Simulation (Risk Analysis, 2017)
Construction injury prediction with machine learning
We can train machine learning algorithms to predict the safety outcomes already present in your dataset given the attributes extracted by our NLP tool. Examples of predicted safety outcomes include body part, injury severity, and injury type. After careful training and evaluation, the models are deployed and made accessible to you via an interactive web app, where you can enter attributes and get predictions. The counts over time of the selected attributes are also shown, along with a risk estimate and the historical reports matching the selection. This is very useful, e.g., to prepare a work package in the office or illustrate a toolbox talk onsite.
This part allows you to answer the following questions: (1) given a description of a construction situation in terms of attributes, what are the most likely outcomes, should an incident occur? (2) which attributes are the most predictive of a given outcome category? (3) what historical cases match the attribute selection? (4) which attributes are trending? Watch the video below for more!
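To give a feel for question (1), here is a toy sketch that estimates outcome probabilities from a set of attributes. It uses a small Bernoulli naive Bayes classifier written in plain Python purely as a stand-in; it is not meant to represent the algorithms we actually deploy, and the training data and the function name predict_proba are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (attributes extracted from a report, recorded injury type).
training = [
    ({"welding", "confined workspace"}, "burn"),
    ({"welding", "working overhead"}, "burn"),
    ({"powered tool", "working overhead"}, "cut"),
    ({"powered tool"}, "cut"),
    ({"object on the floor"}, "sprain"),
]

vocabulary = sorted(set().union(*(attrs for attrs, _ in training)))
class_counts = Counter(label for _, label in training)
attr_counts = defaultdict(Counter)  # label -> attribute -> number of reports
for attrs, label in training:
    attr_counts[label].update(attrs)


def predict_proba(attributes):
    """Bernoulli naive Bayes with add-one smoothing: P(outcome | attributes)."""
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / len(training))  # class prior
        for attr in vocabulary:
            p = (attr_counts[label][attr] + 1) / (n + 2)  # smoothed P(attr | label)
            score += math.log(p if attr in attributes else 1 - p)
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    unnormalized = {label: math.exp(s - top) for label, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {label: v / total for label, v in unnormalized.items()}


probabilities = predict_proba({"welding"})
for label, p in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{label}: {p:.2f}")
```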
Read our paper!
AI Predicts Independent Construction Safety Outcomes from Universal Attributes (Automation in Construction, 2020)
Binary and semantic text search engines for construction safety report databases
The ability to search an injury, near miss, or observation dataset with natural language queries is a must-have. We offer two complementary search engines to enable you to find exactly the reports you are looking for.
First, the binary search engine supports complex word and phrase queries and combinations thereof, using operators such as or, and, all, and all in order.
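The operators mentioned above can be sketched in a few lines. This is only an illustration of the kind of matching involved: the function names, the tokenizer, and the sample reports are hypothetical, and "all in order" is interpreted here as allowing gaps between terms.

```python
import re


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_any(tokens, terms):
    """'or' operator: at least one term appears."""
    return any(t in tokens for t in terms)


def match_all(tokens, terms):
    """'all' operator: every term appears, in any position."""
    return all(t in tokens for t in terms)


def match_all_in_order(tokens, terms):
    """'all in order' operator: terms appear in the given order, gaps allowed."""
    remaining = iter(tokens)
    # `t in remaining` consumes the iterator up to the match, which enforces order.
    return all(t in remaining for t in terms)


def match_phrase(tokens, phrase):
    """Phrase query: the words of the phrase appear consecutively."""
    words = tokenize(phrase)
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


# Hypothetical report narratives.
reports = [
    "Worker fell from ladder while carrying a nail gun",
    "Nail struck hand; gun misfired near the ladder",
    "Dust in eye while grinding overhead",
]


def search(predicate):
    """Return the reports whose token list satisfies the predicate."""
    return [r for r in reports if predicate(tokenize(r))]


print(search(lambda t: match_phrase(t, "nail gun")))
print(search(lambda t: match_all(t, ["ladder"]) and match_any(t, ["gun", "hand"])))
```

Since each operator is just a predicate on a token list, operators can be combined freely with Python's own and, or, and not, as in the second query.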
With the semantic (or smart) search engine, on the other hand, you can enter a completely free text query of any length. The engine captures the meaning of the query, including synonymy and semantics, and retrieves reports depicting similar situations (same environment, same outcomes), even if they do not share any word with the query. This functionality is great for finding reports sharing a common theme or concept in large databases without having to know and manually enter all synonyms of a keyword or keyphrase. In a way, it is as if query expansion were done implicitly.
And of course, the reports can also be filtered based on your variables or the variables extracted by our NLP tool (attributes and outcomes). Watch the video below for more!
Learning construction injury precursors from raw textual safety reports
Injury precursors or causal factors can be automatically extracted from raw text reports as a by-product of training models to predict various independent variables, including, but not limited to, safety outcomes such as injury type, injury severity, and body part injured.
Our methodology relies on attention-based deep learning models (convolutional and recurrent neural networks with attention) and more traditional supervised machine learning models (TF-IDF + SVM).
We continually update our code and models to stay at the cutting edge of progress in AI and NLP.
In addition to the learned precursors, our suite of tools allows you to get the predictions of the models for any natural language query (e.g., free text description of the work environment), and to visualize the attentional decisions of the models and their internal representations. Such visualizations are very helpful in understanding the predictions.
Watch the video below and read our paper for more!
Read our paper!
Automatically Learning Construction Injury Precursors from Text (Automation in Construction, 2020)
We are actively conducting research on the following exciting new topics. Stay tuned!
Automatic hazard and energy recognition in construction photos and videos
(Under development). In this project, we apply state-of-the-art artificial intelligence to detect workers, vehicles, energy sources, safety hazards and controls in photos and videos of construction sites. This project will have numerous applications in the safety and productivity domains. Some examples include live identification of safety hazards, live monitoring of the number of people in specific areas onsite, detection of lack of PPE, indexing, organization and smart search of large databases of construction photos and videos, and more.
Watch the video below for some early results!
Automatic safety meeting minute generation
(Under development). Here, we develop a series of tools to automatically generate meeting minutes from speech-to-text transcriptions and to prefill safety forms, such as JHAs. More soon!
Our tech stack
We use Python and R heavily, sometimes mixed in the same script! To train our models, generate our reports and develop our applications and APIs, we rely on awesome libraries such as TensorFlow, PyTorch, Scikit-learn, R Markdown, Shiny, Flask, and Plumber.