Transparency prevails as T3000 users air experiences

“Guns don’t kill people, people kill people.” While one might argue endlessly about this favorite slogan of gun rights advocates, the sentiment behind it applies to what this analyst heard at the Siemens T3000 Users Group Conference held June 8-11 in Alpharetta, Ga. That is, T3000 control systems don’t fail, per se, so much as lines of communication among those responsible for them can. This is CCJ’s second report from the 2015 T3000 meeting.

In an age of corporate paranoia over messaging, when little constructive critique or problem solving for the good of the industry takes place outside closed rooms, the invitation extended to CCJ to participate in the conference is itself testament to Siemens’ goal of greater transparency regarding T3000 implementations and continuing support. Readers should view the issues identified here as punch-list items to review with other T3000 users and Siemens when evaluating existing automation or considering new or replacement systems.

Lessons learned. During the first part of the morning, Pieter Smuts, director of operations, Siemens Energy Inc, reviewed several “project implementation events” and lessons learned.

One involved uneven heating in the intermediate-pressure (IP) steam turbine after an IP control valve was inadvertently shut during warmup and high-pressure (HP) seal steam was improperly routed to the IP exhaust. From the control-system perspective, turning-gear logic was incorrectly applied to the warm-up procedure. The root cause proved to be that a logic error from an earlier unit, commissioned 10 years prior, was not caught when the second unit was commissioned.

Lessons learned: Don’t change things in the field without informing engineers; conduct a critical design review and keep all conclusions in one place; and have a clear assignment of responsibilities.

Another site experienced a phase failure between a B-phase circuit ring and a C-phase main bushing lead. This resulted in adverse temperatures in the generator, attributed to erroneous stator and gas temperature readings and insufficient cooling of the generator stator. From the control-system perspective, the root cause proved to be incorrect temperature readings traced to incorrect hardware proxy settings in the software. The consequence was serious: the customer was out of operation for several weeks.

Smuts described this as a human-factors incident. Siemens instituted a “quality stand down” at all commissioning sites, enhanced specific automation system testing, conducted internal and external quality audits, and then rolled out a new performance training regimen for field engineers.

Lessons learned: Site personnel must validate commissioning procedures; don’t fire people for making mistakes, but do fire them for not following procedures.

A user responded from the audience that this should also be factored into lock-out/tag-out (LOTO) procedures: a second person must check everything as part of independent verification.

A third site experienced incorrect fuel fraction delivery to the gas turbine. The adjustment intended for one unit was entered into a running unit. The event was followed by a thorough job safety-analysis review.

Lesson learned: Require separate log-ins for separate units.
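In software terms, this lesson amounts to binding each log-in to a single unit and rejecting any adjustment addressed to a different one. The sketch below is purely illustrative: the names `UnitSession` and `apply_adjustment`, the unit labels, and the parameter name are hypothetical, and nothing here reflects how T3000 itself manages log-ins.

```python
from dataclasses import dataclass


class WrongUnitError(Exception):
    """Raised when an adjustment targets a unit other than the one logged into."""


@dataclass(frozen=True)
class UnitSession:
    """A hypothetical operator session that is valid for exactly one unit."""
    user: str
    unit: str


def apply_adjustment(session, target_unit, parameter, value, settings):
    """Store an adjustment only if it targets the session's own unit.

    `settings` is a plain dict of {unit: {parameter: value}} standing in
    for whatever store a real control system would use (an assumption).
    """
    if target_unit != session.unit:
        raise WrongUnitError(
            f"{session.user} is logged into {session.unit}, not {target_unit}"
        )
    settings.setdefault(target_unit, {})[parameter] = value
    return settings
```

With a session opened for one unit, a call such as `apply_adjustment(session, "GT2", ...)` against a session bound to "GT1" raises `WrongUnitError` and leaves the settings for the running unit untouched.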

Aggravation with Ethernet switches. There was little joy on either the user or vendor side regarding Scalance devices, Siemens’ version of an Ethernet switch. This does appear to be a straight-up hardware issue rather than a human one.

Faris Khalil, head of Siemens’ new US R&D center, noted that failures of the “multi-mode fiberoptic Ethernet transceivers involve multiple communications problems, as well as defective soldering of the photo-diodes.” Siemens describes the issue as a generic “open point.” Apparently, the Scalance devices are not Siemens products, and are plagued by batch manufacturing defects.

One user exclaimed, “We can’t have failures of these things in the summer,” suggesting these devices are important to uptime. He continued with, “We have more problems with the new system than the old TXP.” Recall that T3000 has replaced Siemens’ TXP control system at many US sites. Another asked, “How can a simple single switch cause a system failure?”, essentially confirming their importance. A third user asked if “there is anything plant personnel can do to check their own units?” The response was to make sure spares are on hand.

However, even spares may not be adequate unless their serial numbers are traced to ensure the devices are not subject to the defect. Khalil responded that Siemens will not wait for Scalance devices to fail but will be proactive in getting replacements to customers. A Siemens cybersecurity expert noted that all unused ports on the Scalance devices need to be “locked down.”
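The serial-number check on spares can be pictured as a simple screening step. The following is a minimal sketch under stated assumptions: the function name `screen_spares` is hypothetical, and the list of affected serial numbers would have to come from the vendor; no such list is given in this report.

```python
def screen_spares(spare_serials, affected_serials):
    """Split spare devices into usable and suspect lists by serial number.

    `affected_serials` stands in for a vendor-supplied list of serial
    numbers from defective batches (an assumption for illustration).
    """
    affected = set(affected_serials)
    usable = [s for s in spare_serials if s not in affected]
    suspect = [s for s in spare_serials if s in affected]
    return usable, suspect
```

Called with a plant's spares inventory and the vendor's list, the function returns the spares that can stay on the shelf and those that should be flagged for replacement before they are ever installed.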

General grumblings, comments, observations. The following is a laundry list of other concerns expressed either from the podium or the audience:

      • Many users complain about the slow T3000 “work bench,” which integrates onto one screen all views of the plant for operations, modifications, tuning, optimization, and I&C diagnostics (depending on access rights, of course). One noted that “the system works fine for a while, then destabilizes and has to be rebooted.” Navigation buttons slow the work bench down, another observed.

      • Servers from 2008 and before are no longer serviced, and upgrades are needed for Microsoft 2012-2016 based users. One user asked, “Why are you behind in your server development?” Concerns include reliability aspects and performance tests. The response was that it is difficult to change operating systems.

      • The turnaround time between when a customer reports a problem and when Siemens is able to respond received spirited discussion. One commented that some problem reports were submitted a year and a half ago with no response. Another asked if there was a way to track progress and include more user involvement in the process. Siemens responded that it is adding two developers to the R&D group to improve the process.

      • A user observed that few sites use the individual log-in/log-out features, preferring the plant as a whole to remain logged in at all times. Users are skeptical: they don’t care for the “big brother” mentality and don’t feel they need additional oversight. In a similar vein, other users weren’t crazy about the search capability and accessibility of online logbooks, implying that the feature isn’t valued, at least at the plant level.

      • There was discussion about the ability to turn application nodes on and off remotely. A user asked, “What about emergency situations to access the automation data highway directly?”

      • A user asked, “How do we do a backup for the entire server?” Others responded that a procedure is necessary for conducting a full backup for disaster-recovery purposes.

      • Issues with OPC (OLE for Process Control) communication represent the largest number of engagements between customers and the Siemens 24/7 remote expert network “hotline.”

      • Finally, one user observed (and several others nodded in affirmation) that Siemens was very good at getting replacement parts to customers but not good at communicating to users about the problem parts.
