Tuesday, May 12, 2015

A Journey of a Thousand Failures

In the course of a year or so, the blower belt in the outdoor HVAC combo unit had developed a prolonged squeal at startup... or rather, the squeal it had when new became obnoxious and couldn't be ignored.  Opening the unit up to adjust belt tension revealed a few problems that turned into a months-long journey of failures and progressively complicated half-baked attempts to fix things on the cheap.

The first problem I noticed was that it was impossible to tension the belt, for a number of reasons related to design.  The motor is bolted to a sliding mount with slots on either end.  Small 1/4" bolts (maybe the originals were M6, I don't remember) fit through the slots and into captive nuts in the chassis, as shown in the post-repair picture.  The primary problem with this is that the slot is over 3/8" wide.  The naked little bolt head actually gets sucked into the slot if there isn't a substantial washer under it (which there wasn't).  Likewise, the captive nut gets sucked into the slot and binds up the bolt without actually clamping the bracket tight.  To make sure this collage of randomly selected dimensions gets turned into a proper clusterfuck, they decided to use captive nuts made out of something resembling metallized cheddar cheese.  I barely had the thing past "snug" before the nuts were turned into captive washers.  I ended up tapping some captive nuts from 2" chunks of 1/2" or so bar stock.  The bolts are grade 8 and have some heavy 1/8" flat plate spreaders with keys to fit the slots.  I was going to use larger bolts, but for some reason, I didn't care that much.  Imagine that.

Motor mount (post mod), showing oversize slots, keyed plates, and belt chowder everywhere

Once things could actually be tightened, I was humorously and expectantly disappointed to find that the motor was still able to smoke the belt at startup.  Now let's step back a minute and take a look at the blower system.  It's a simple belt-driven blower on a short (~20 feet) duct run.  The motor is a Marathon three-phase anomaly.  It's one of those cases where the manufacturer provides the motor to the OEM with no HP rating on the nameplate.  They do this for air compressors so that the OEM can slap any made-up fantasy bullshit rating on the equipment sticker; I imagine this is probably a similar case.  Bullshit aside, the nameplate current says it's a 2HP 1750 RPM 240/480v motor.  It's probably a bit oversize for the installation, but I don't have a suitable replacement.

The original pulleys were a ~3" adjustable on the motor and a 4" plastic sheave on the blower. I'm not going to go back and look it up, but my books say that's marginal at best if trying to deliver torque to the inertial load at startup.  It's pretty much built to slip the belt at startup.  After swapping out for a pair of sheaves that would comfortably transfer power without slipping, the reason why became a bit obvious.  The motor must either slip the belt, or it will rock the entire unit as it launches the blower assembly up to speed in under 170 milliseconds.  I've already had one of these crimped-together blower wheels come apart in service on the prior unit.  I am sure that sort of shear stress would be particularly good at popping them apart.  Can the motor be tamed?
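For a rough sense of the numbers, here's a back-of-the-envelope sketch in Python.  It assumes the blower ends up near the speed implied by the original ~3" to 4" sheave ratio (the replacement sheave sizes aren't given above), uses the usual 5252 constant to convert HP and RPM to lb-ft, and takes the 170 ms figure at face value.  The function names are made up for illustration.

```python
import math

MOTOR_RPM = 1750.0       # nameplate speed
MOTOR_HP = 2.0           # rating inferred from nameplate current
MOTOR_SHEAVE_IN = 3.0    # "~3 inch" adjustable sheave (approximate)
BLOWER_SHEAVE_IN = 4.0   # 4 inch plastic sheave
SPINUP_S = 0.170         # spin-up time with non-slipping sheaves

def blower_rpm(motor_rpm, d_motor_in, d_blower_in):
    """Driven speed for a simple two-sheave belt drive (no slip)."""
    return motor_rpm * d_motor_in / d_blower_in

def full_load_torque_lbft(hp, rpm):
    """Shaft torque in lb-ft at rated power and speed."""
    return 5252.0 * hp / rpm

def mean_angular_accel(rpm, seconds):
    """Average angular acceleration (rad/s^2) from rest to rpm."""
    return rpm * 2.0 * math.pi / 60.0 / seconds

rpm_blower = blower_rpm(MOTOR_RPM, MOTOR_SHEAVE_IN, BLOWER_SHEAVE_IN)
torque = full_load_torque_lbft(MOTOR_HP, MOTOR_RPM)
accel = mean_angular_accel(rpm_blower, SPINUP_S)

print(f"blower speed:     {rpm_blower:.0f} RPM")
print(f"full-load torque: {torque:.1f} lb-ft at the motor shaft")
print(f"mean accel:       {accel:.0f} rad/s^2 over {SPINUP_S * 1000:.0f} ms")
```

Full-load torque is only a reference point here.  The torque actually available during a direct-on-line start is typically a good deal higher, which is the whole problem.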

I'm way too broke to buy a decent soft-start unit, and I didn't see anything appropriate on eBay at any point along the way, so the idea of a drop-in solution is right out.  Can I do a thyristor chopper with a simple ramped firing angle?  Not that I can figure.  The wye-connected motor and open-delta service mean that the wye point of the motor can't be grounded.  Without a path for neutral currents, no phase can be triggered in isolation.  It turns into an ugly mess of either having a severely limited range of available firing angles, or other unsavory prospects.  I'm certainly not going to go full-on DC-link and inversion.  It would be too expensive to build anyway.  I had initially decided to attempt a 50% undervoltage start scheme similar to wye-delta starting, simply using the fact that it's a 240/480v motor.  I decided that this was the safest route since I could test that the motor could develop sufficient torque in this manner prior to actually purchasing parts.  In my assessment, resistor starting might involve more testing and unknowns.
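The reason the torque test matters: to a first approximation (ignoring saturation and other nonlinearities), an induction motor's locked-rotor torque scales with the square of the voltage across its windings.  A quick Python sketch of that rule of thumb follows; the helper name is hypothetical, and the result is only an estimate, not a measurement of this motor.

```python
import math

def start_torque_fraction(winding_voltage_fraction):
    """Approximate locked-rotor torque relative to a full-voltage start.

    Rule of thumb only: torque ~ V^2, ignoring saturation.
    """
    return winding_voltage_fraction ** 2

# 50% undervoltage start (half of rated voltage on each winding)
half_voltage = start_torque_fraction(0.5)

# Classic wye-delta start for comparison (winding voltage drops to 1/sqrt(3))
wye_delta = start_torque_fraction(1.0 / math.sqrt(3.0))

print(f"50% voltage start: {half_voltage:.0%} of full-voltage starting torque")
print(f"wye-delta start:   {wye_delta:.0%} of full-voltage starting torque")
```

So the half-voltage scheme gives up a bit more starting torque than a wye-delta start would, which is why checking that the blower still comes up to speed is worth doing before buying parts.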

Totally wrong resistor footprints
BTA26 triacs and excessive trace reinforcement

The basic building (and stumbling) block of the attempted solution is a three-phase solid-state relay.  I figured I could make my own SSRs for cheaper than I could buy them.  I'm sure I cut a few corners, but unlike the case of a shady Chinese eBay SSR, at least I have a vague idea what those corners might be.  To sequence the three relays required for an undervoltage starter, I put together a little overbuilt control timer and relay driver.  The timer could have been done more simply, but the simple transistor-transistor delay circuits I had tried were a bit unreliable at enforcing a minimum deadtime between outputs.  I ended up using a microcontroller to provide a pot-selected 0.5-3 second delay with a 100ms deadtime.
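The timing logic itself is about this simple.  What follows is a Python model of the sequencing, not the actual firmware: the 10-bit ADC range, the linear pot mapping, and the names are all assumptions, and I'm treating the two outputs as "start" and "run".

```python
DELAY_MIN_S = 0.5   # shortest pot-selected start delay
DELAY_MAX_S = 3.0   # longest pot-selected start delay
DEADTIME_S = 0.1    # both outputs off between start and run
ADC_MAX = 1023      # assumption: 10-bit ADC reading the pot

def delay_from_pot(adc_reading):
    """Map a raw pot reading linearly onto the 0.5-3 s delay range."""
    adc_reading = max(0, min(ADC_MAX, adc_reading))
    return DELAY_MIN_S + (DELAY_MAX_S - DELAY_MIN_S) * adc_reading / ADC_MAX

def outputs(t, delay_s, deadtime_s=DEADTIME_S):
    """Return (start_on, run_on) at t seconds after the start command."""
    if t < 0:
        return (False, False)
    if t < delay_s:
        return (True, False)    # reduced-voltage start
    if t < delay_s + deadtime_s:
        return (False, False)   # deadtime: let the start relay commutate off
    return (False, True)        # full-voltage run

delay = delay_from_pot(512)
for ms in range(0, 4000, 250):
    start_on, run_on = outputs(ms / 1000.0, delay)
    print(f"{ms:5d} ms  start={int(start_on)}  run={int(run_on)}")
```

The point of the deadtime is that a triac keeps conducting until its current falls to zero, up to a half cycle after the gate drive is removed, so the run output must not come up the instant the start output drops.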

Mostly junk parts
Microcontroller, dual-channel mosfet, regulator

I did my best to blindly do snubber design for the standard (BTA26, B-suffix) triacs, but without being able to better characterize the load, I may have done a poor job of it.  Whether the issue is dv/dt, overshoot, or fake eBay triacs, I can't tell anymore.  Despite working well across all the bench tests, two of my test relays incinerated themselves when the start relay re-triggered while the other two were online.  Disregarding the nuances of what initiated it, a relay collision in the three-relay undervoltage-start configuration results in a direct line-line fault.  I can't expect the relay to survive after that.

After ages of self-deprecation and alcohol, I decided that I wasn't going to rebuild the destroyed parts and attempt to fix the triggering.  I contemplated the cost of letting belts burn up and weighed it against my inertia.  Somewhere I got angry enough to throw some numbers at a resistor-start solution with the remaining parts.  The result:

Resistors mounted where the other relays were

Using the following ballast resistor app note, I was able to get a starting point for appropriate resistor values (something I had expected to require some trial and error).  Using my crude attempt at a motor model, I found that the recommended 6 ohms/phase was indeed quite reasonable.  Given the availability of cheap shitty Chinese power fuses (er, resistors) in only particular values, I settled for 8 ohms.  Splitting the resistance across two 100W packages gives me sufficient margin to hopefully account for the questionable durability of the resistors.  The average power delivered to each resistor during the start should be around 200W (my meters are too slow for me to measure shit), whereas the rated overload power for the maximum 3 second start delay is 833W.  At this point, the timer is severe overkill.  There's no point in the accurate timing and dual outputs.  The remaining relay has been cleaned and had fresh triacs installed.  Everything is greased and mounted to a 5"x0.25" aluminum plate.
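To put the resistor margin in one place, here's a small Python tally using only the figures from this post: 100W packages, an estimated ~200W per resistor during the start, an 833W overload rating for a 3 second event, and the roughly 2.5 second start I'm seeing.  The 200W figure is an estimate rather than a measurement, and the names are made up for illustration.

```python
RATED_W = 100.0          # continuous rating of each resistor package
EST_START_W = 200.0      # estimated average power per resistor during start
OVERLOAD_W = 833.0       # rated overload power for a 3 s event
MAX_START_S = 3.0        # longest delay the timer can be set to
OBSERVED_START_S = 2.5   # approximate start time with 8 ohm/phase

def start_energy_j(avg_power_w, seconds):
    """Energy dumped into one resistor over a single start."""
    return avg_power_w * seconds

def overload_margin(avg_power_w, overload_w):
    """How far the estimated start power sits below the overload rating."""
    return overload_w / avg_power_w

energy_observed = start_energy_j(EST_START_W, OBSERVED_START_S)
energy_worst = start_energy_j(EST_START_W, MAX_START_S)
energy_allowed = start_energy_j(OVERLOAD_W, MAX_START_S)
margin = overload_margin(EST_START_W, OVERLOAD_W)

print(f"energy per start (observed):  {energy_observed:.0f} J")
print(f"energy per start (max delay): {energy_worst:.0f} J")
print(f"overload rating over 3 s:     {energy_allowed:.0f} J")
print(f"overload margin:              {margin:.1f}x")
print(f"start power vs continuous:    {EST_START_W / RATED_W:.1f}x")
```

This says nothing about how quickly the resistors cool between starts, so it only covers a single start from cold.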

So far, this configuration works well.  The start is a bit slow (about 2.5 seconds), but I am using 8 ohm/phase instead of 6.  The transfer is smooth, and there is no risk of collision or turn-off stresses (since this is downstream from a mechanical contactor).  In the end, the problems and my inability to locate the causes of failure on a zero-dollar budget with minimal tools leave me unable to be proud of any part of this venture.  I'm just glad it's over for the moment.  I'm only writing this bullshit blog because I have literally nothing else to do other than sleep.  I have no shame.  That's the primary reason I'm here.

I had originally planned to prepare complete documentation with drawings, sim files, and a BOM, but at this point I can't be bothered.  I slapped some notes on the eagle files and tossed everything in a zip file.  If you dare to try building the relays or adapting them, try out the following app notes:

Use and protection of triacs:

Using & snubbing optotriacs:

Direct gate drive circuits:

Friday, May 1, 2015

Hybrid IC's always make interesting failures

In the course of troubleshooting one of my monitors (Panasonic PanaSync P21), I came across this unexpected component fault.  The monitor had been exhibiting gradual brightness degradation for months, which I had not really noticed until the fault became dramatically asymmetric.  Within a few weeks, I lost 60% - 80% of the green output and a large chunk of red.  Due to the weight of the monitor and the fact that it was in a rack, bench troubleshooting methods were not exactly an option.  The problem was isolated with a few basic checks.  The input mixer board connects to the CRT neck board via three coaxial jumpers.  Swapping channels indicated that the fault was in the CRT or in the cathode driver circuit.  The cathode driver circuit consists of one mixer/amplifier IC for each of three channels, plus a single cathode driver IC mounted on a heat sink.  A cursory check of all the associated capacitors left the problem buried within the realm of the four obscure IC's or the CRT.  At this point, it can't be fixed.  I popped the CRT board back into the monitor and figured I'd use it until I could get someone to give me a hand wrestling glass.  Within a few hours, the green and red output continued to drop noticeably until the red channel began to flicker between 0 and 100% before dying completely with an audible snap.  It's incredibly difficult to read dim cyan on black for hours.

After eventually swapping the monitor with a spare, I stripped it for disposal and salvage.  I took another look at the cathode driver board just to see if the audible failure came from any of the IC's.  I clipped the main driver IC off the board and decided to take a closer look.

SHW3528 cathode driver hybrid IC (leads clipped off)

Right away I became suspicious.  With my terrible eyesight, I hadn't noticed before that this wasn't a typically encapsulated package, but a custom hybrid IC.  I know it's only anecdotal, but I doubt I have encountered such a device that wasn't a problem or didn't experience a packaging-related failure (shakes fist at those goddamn STK3122's).  The device was mounted with heat sink compound, though the compound was rather petrified.  I popped the cover off and had a look inside.

Snapped substrate and fractured attachment solder

The ceramic substrate broke when I removed the cover.  This pretty much reveals the root cause of failure.  From left to right, the associated channels are green, red, blue.  The progressive degradation was due to a failure of the substrate attachment to the thermal tab.  This appears to be a reflowed high-tin solder, and was clearly fractured and corroded.  The only part of the substrate remaining firmly attached to the thermal tab is the blue channel.  As a rough guess, I'd say that there was maybe 20W or so intended to go through that thermal interface.  I can only imagine how hot the substrate and junctions were operating.

That's your thermal path.  No really!

That is some ugly shit.

By the time I bothered to take pictures, the bond wires had seen a thumb or two, but the dice in the red channel are still clearly marked with craters indicating their failure.  Whether the sudden failure of the red channel was induced by the board removal and reinsertion is unknown, but it seems likely.

Two rightmost dice of red channel showing (blurry) cratering

So there: the substrate attach had fractured and the thermal path was compromised, causing an asymmetric degradation of the device channels.  It can't be seen from the photos, but the plating on some of the leads is coming off and the metal is heavily tarnished underneath.  It's easy to just say that thermal cycling caused the fracture, but I'm rather certain that there are some issues of metallurgical nuance to be blamed as well.