What is Fault Detection & Diagnostics?

Definition

Fault Detection and Diagnostics (FDD) software identifies anomalies in the performance of critical equipment such as boilers, chillers, motors, elevators, pumps, and exhaust fans. More recent advances in FDD enable the software to translate those anomalies into real-world faults and to notify operators of not only the root cause of an issue but also how to resolve it.

Why it’s Important

Fault detection involves a lot more than checking whether a reading has crossed a threshold. It’s about having a dynamic understanding of the environment and putting the problem in context. Instead of alerting operators that there’s been a spike in energy use, the system identifies that the fan belt slipped, the motor failed, the equipment is short cycling, or the pumps are not feeding the boiler and the system is going to run out of hot water.

Imagine a feedback loop in which, whenever something goes wrong, your best engineers investigate it. They look at the data, look at the system, figure out the problem, fix it, and put it to bed. Fault detection technology bakes that knowledge into algorithms and repeats the same investigation hundreds, thousands, even millions of times, so the next time the problem occurs, we know exactly what’s wrong.

In many commercial real estate portfolios, fault detection requires making judgment calls based on incomplete data. In a good scenario, this data is derived from sensors connected to a robust Building Management System (BMS). In the 90% of buildings that do not have a BMS installed, the only datasets available are pieced together from spreadsheets, maintenance logs, and utility bills. These "blind spots" can be filled in with equipment-level submeter solutions.

Undetected equipment faults can lead to deeper problems. A "fault" does not have to be the complete failure of a piece of equipment. For instance, a fault might be defined as a drift in performance. In commercial real estate, the root cause of non-optimal operation might be an equipment failure, but it might also be a changed setpoint, a changed schedule, or human error. A fault may be treated as a binary variable ("OK" vs. "failed"), or it may have a numerical "extent."

Many traditional fault detection and diagnostics solutions operate on thresholds. Some equipment is designed to run at all times, so an alert that a system is no longer drawing any power can drastically shorten maintenance resolution times by directing operators to the right piece of equipment in real time. The logic here is simple: if power drops below a defined number, trigger an alert.
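The threshold rule described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the PowerReading record, the low_power_alerts function, the equipment name, and the 0.5 kW threshold are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class PowerReading:
    """One submeter sample (hypothetical record layout for this sketch)."""
    equipment_id: str
    timestamp: str      # ISO 8601 text, kept as a string for simplicity
    power_kw: float


def low_power_alerts(readings, min_power_kw):
    """Return every reading whose power draw is below the threshold."""
    return [r for r in readings if r.power_kw < min_power_kw]


# Illustrative data: a pump that should run continuously stops drawing power.
readings = [
    PowerReading("pump-1", "2024-01-01T00:00", 4.2),
    PowerReading("pump-1", "2024-01-01T00:15", 4.1),
    PowerReading("pump-1", "2024-01-01T00:30", 0.0),
]

alerts = low_power_alerts(readings, min_power_kw=0.5)
for a in alerts:
    print(f"ALERT {a.equipment_id} at {a.timestamp}: {a.power_kw} kW")
```

Each reading is judged on its own, with no memory of earlier readings. That is what makes threshold rules easy to build, and also what limits them.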

On the other hand, equally damaging issues cannot be identified with such simple logic. One example is short cycling, in which equipment shuts down and starts up in rapid succession. This shortens equipment life and drives up energy costs. Because the length of each cycle can vary widely, no simple rule identifies the issue; it requires much more sophisticated analysis.
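To show why short cycling is harder than a threshold check, here is a deliberately naive first approximation in Python: count how many times the equipment starts within an observation window and flag the window if the count exceeds a limit. The function names, the 1.0 kW "on" threshold, and the limit of 3 starts are assumptions made for illustration. Because real cycle lengths vary, a fixed window and limit like this would need tuning per piece of equipment and would not replace the more sophisticated analysis described above.

```python
def count_starts(power_kw, on_threshold_kw):
    """Count off-to-on transitions between consecutive power samples."""
    is_on = [p >= on_threshold_kw for p in power_kw]
    return sum(1 for prev, cur in zip(is_on, is_on[1:]) if cur and not prev)


def is_short_cycling(power_kw, on_threshold_kw, max_starts):
    """Flag a window of samples that contains more starts than allowed."""
    return count_starts(power_kw, on_threshold_kw) > max_starts


# Illustrative windows of equally spaced samples (values in kW).
healthy = [5, 5, 5, 5, 0, 0, 0, 0, 5, 5, 5, 5]   # one long off period
cycling = [5, 0, 5, 0, 5, 0, 5, 0, 5, 0, 5, 0]   # rapid on/off switching

print(is_short_cycling(healthy, on_threshold_kw=1.0, max_starts=3))
print(is_short_cycling(cycling, on_threshold_kw=1.0, max_starts=3))
```

Unlike the threshold rule, this check has to look at a sequence of readings rather than a single value, and its answer depends on the chosen window and limit. A single reading in either series looks normal on its own.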
