Machine Learning: An Overview Pt.2
Machine learning (ML) is an emerging field that attracts a great deal of interest but is not well understood. This blog post expands on the ideas discussed in the previously published post “Machine Learning: An Overview Pt.1”, which presents an overview of ML principles and applications in FAQ form. In Pt.2, we provide insights into why and when ML should be used, how an ML algorithm is trained, and potential areas of application for the architecture, engineering, and construction (AEC) sector.
Why should I use ML?
A key benefit of ML is that it can be applied to a wide variety of problems. Applications range from basic tasks, such as routine corporate activities, to highly complex tasks, such as recognizing and predicting trends. Examples of tasks where ML has been used include:
- Employee Hiring: Reviewing resumes to shortlist candidates.
- Finances: Matching invoices using natural language processing.
- Predictive Maintenance: Predicting and detecting anomalies in infrastructure to prevent disruptions (e.g. the condition of a concrete bridge).
- Product Recommendations: Using purchase history to recommend products to consumers.
- Computer Vision: Identifying and categorizing objects in images and videos.
When should I use ML?
The question of when to use ML is complicated, as developing an ML solution requires a significant investment of time and resources. To develop an ML solution that creates value, the following questions should be asked. If the answer to any of them is “no”, ML is unlikely to produce significant benefits:
- Do you have a good understanding of the problem that must be solved?
- Is the problem you are trying to solve a recurring, repetitive problem? Is it a scalable and/or transferable problem?
- Is ML expected to save a significant amount of time and/or resources?
- Is there a large dataset available to train the software? If a dataset is not readily available, is it relatively easy to obtain an appropriate dataset?
How do you “train” ML software?
An ML algorithm can be thought of as a person learning a new task. At first, a person learns how to complete a task through training and then, by carrying out the task, the person gains experience. This experience gives the person the ability to complete more complex tasks, and to complete tasks more quickly and effectively. Similarly, an ML algorithm must be trained by giving it vast amounts of data. The ML algorithm then artificially “learns” how to complete the task, which allows it to “understand” how to complete similar tasks. The four primary methods of training an ML algorithm are as follows:
- Supervised learning: The algorithm is given inputs and the corresponding correct outputs (i.e. “labeled data”). The algorithm then calibrates itself based on the difference between its actual output and the correct output. Example: Detecting fraudulent credit card transactions.
- Semi-supervised learning: The algorithm uses a mix of labeled data and unlabeled data for training (i.e. some inputs come with correct outputs and others come with no outputs at all), and the algorithm must figure out the “right” answer for the unlabeled portion. This is typically used when labeled data is expensive to obtain. Examples: Classification, regression, and prediction.
- Unsupervised learning: The algorithm is given unlabeled data and must recognize trends within the data by itself. Example: Product recommendation algorithms.
- Reinforcement learning: Through trial and error, the algorithm learns which actions generate the optimal results. Examples: Robotics, navigation, and video game AIs.
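As a rough illustration of the supervised and unsupervised methods listed above, the following Python sketch trains a one-variable linear model on labeled data and then groups unlabeled values into two clusters. The function names, the toy data, and the choice of gradient descent and a basic k-means loop are assumptions made for this example only; a real project would normally rely on an established ML library instead.

```python
def train_supervised(inputs, targets, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b to labeled data by gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        # Error = the model's actual output minus the correct (labeled) output.
        errors = [(w * x + b) - y for x, y in zip(inputs, targets)]
        # Calibrate the parameters in the direction that reduces the error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def cluster_unsupervised(values, iterations=20):
    """Split unlabeled 1-D values into two groups with a basic k-means loop."""
    centers = [min(values), max(values)]  # simple starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to its nearest center.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (unchanged if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Supervised: toy labeled data that follows y = 2x + 1 (made up for illustration).
labeled_inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
labeled_outputs = [3.0, 5.0, 7.0, 9.0, 11.0]
w, b = train_supervised(labeled_inputs, labeled_outputs)
print(f"learned w={w:.3f}, b={b:.3f}")

# Unsupervised: the same kind of numbers, but with no correct outputs attached.
unlabeled = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers, groups = cluster_unsupervised(unlabeled)
print("cluster centers:", centers)
```

The difference between the two functions mirrors the difference between the methods: the first is handed the correct outputs and adjusts itself to match them, while the second receives only raw values and has to find the grouping on its own.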
What should I consider before using ML?
Given a sound understanding of the required data and the problem that is to be solved using ML, the following items should be considered:
- The appropriate type of algorithm to be employed in order to solve the problem (regression, decision tree, clustering, etc.).
- The type of algorithm training to be employed. This will largely depend on the available dataset and the type of algorithm used, as some types of algorithms require significantly more data, while others require very high-quality data (i.e. no outliers or “noise” in the data).
- The expected correlation between different parameters within the dataset and the expected output (i.e. after the ML algorithm is run, what do you think the relationship between the data will look like? Is this different from what you would expect a human user to find using non-ML methods?).
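One simple way to check an expected relationship between a parameter and an output before committing to an algorithm is to compute a correlation coefficient. The sketch below calculates the Pearson correlation in plain Python; the function name, the variable names, and the floor area and energy figures are hypothetical values invented for this illustration, not real measurements.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example data: building floor area vs. annual energy use.
floor_area_m2 = [100, 150, 200, 250, 300]
annual_energy_mwh = [12.1, 17.8, 24.5, 29.9, 36.2]

r = pearson_correlation(floor_area_m2, annual_energy_mwh)
print(f"correlation between floor area and energy use: {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. Pearson correlation only captures linear effects, so a low value does not rule out a more complex pattern that an ML algorithm might still find.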
How can machine learning be applied within the architecture, engineering, and construction (AEC) sector?
Intelligence:
- General business operations: Hiring, finances, opportunity tracking.
- Transportation analytics: Transit routing, signal timings, delay/travel time predictions.
Buildings:
- Intelligent building platforms: Energy usage predictions, heating and lighting automation, system anomaly detection.
- Dynamic building envelope design: Tracking air, water, heat, light, and noise transfer between a building’s internal environment and the external environment.
Infrastructure:
- Surveying: Data collection, processing, and analysis.
- Identification of potential development sites: Property value prediction, land use patterns, development applications.