Surviving A Work Project

From time to time, I remember a project that I was on back when I was in Texas over half a decade ago.  It worked out in the end, but partly due to luck.  I was oh-so grateful when the project worked out and just would not stop thanking God whenever I thought about it.  I am still thankful to this day.

The Group

I was working in a wireless radio group at Texas Instruments.  Our group of around five people was responsible for the software that went on a few of the chips that we owned and the company sold.  The specialty of these chips was NFC/RFID.  This technology is used in cell phones, badge readers, animal tracking and other areas.

I believe that when I joined the group this project was already in progress.  A lead customer had asked us to design a chip that they could use in their application because they could not find an off-the-shelf solution.

The project had several groups.  In Texas, an engineer who was not on our team was handling the software, apparently because our group had no software engineers free to take it on.  Another person in Germany was handling the hardware portion of the chip.

The software engineer did not want to continue helping us out, so the manager asked me if I could take over her responsibilities – and I decided I would.

I Start Out in the Project

When I started to look over the software stack, I found a few thousand lines of code.  The chip had to handle an industry standard protocol (NFC/RFID) that I did not understand, so looking at the code was mostly pointless.

This codebase had probably been started a decade earlier and was left unfinished after a project cancellation.  It was repurposed for this one.

Looking back, I realize that it is better (when possible) to write something from scratch than to take over someone else’s code.  It is easier to understand that way.

Before this software engineer transferred the project, she decided not to implement patching on the chip.  Patching is the ability to modify small portions of the software after the chip comes out of the factory.  Patching matters here because this chip was not reprogrammable.  The program exists in ROM (read-only memory) on the chip.  Changing the program took about three months of manufacturing and tens of thousands of dollars after the software engineer sent it over to the factory.

The previous engineer said that if the software had a bug, then the company would simply have to eat the costs of remanufacturing the chip.  Ouch, I thought, I hope that doesn’t happen.

She sent over the code, the engineer in Germany sent the chip’s schematics, and we waited three months.  After it came back, testing showed that the chip did not work at all.  It is very unusual to have that happen.

They were able to figure out why.  It turns out everything was okay, but the wrong setting had been selected when sending the design out to the factory.

The hardware engineer corrected the setting, and we waited another three months.  I believe that at this point I came onto the project.

The Homeless Man

I was living alone around this time and decided that I should try to help a homeless man out.  I was doing pretty well and decided to start donating 10% of my paycheck.  It was hard initially, but over time I was able to make it work.  Most of the donations were on autopay, so they required little interaction.

I had seen a man standing beside a big-box hardware store, holding a cardboard sign advertising that he could help with construction work.  He was white, tall, well-built, dressed in bib overalls.

As he is standing in the parking lot, on the grass, I approach him and start talking to him. I am not sure how the conversation went – but he is a nice guy.

The actions and times are murky here.  I am helping him out with money.  Probably around 20 dollars a month.  I take him several times to a restaurant.

I give him a white poster board with markers to replace the cardboard one.  It offends me when he tells me that a police officer told him to go back to the cardboard one. 

Another Homeless Man

One Saturday, I am hanging out in his area and a short man who could pass for a youth comes around.  He has a black eye.  He approaches us and says that he was attacked and mugged nearby.  He asks for money.  I give him 10 dollars.

He stays around us and takes out a stack of photographs.  He starts to show me his previous life.  He was a successful mechanical engineer.  He shows me a picture of one of his creations; it looks like an airport scanner booth.  He was married and had a child.  For some reason his wife was living away from him in Hawaii, and he was working on the mainland.

He shows me a picture of him with a woman.  “She was a screamer – she was good” he tells me. “Oh yeah, I was a whore” he continues.

I ask him why he is homeless when he had a good job.  He says, “I just did not care to work anymore after my wife divorced me.”

He put his stuff in storage and started to live on the street. He walks away.

Give Him a Phone

It now occurs to me that I don’t remember the name of the original homeless man.  It may have been Larry, so I will go with that.

Larry comes around.  “Why did you help him? He is lazy and does not want to work.”

I give Larry an old phone that I had and set him up with a plan that I pay for.  I show him this in Chipotle as we are eating.  “Ah, now I can call and get my old job back,” he says.

Somewhere around this time, I am moving out of an apartment and into a ranch house that I found for rent.  It turns out that I have a month or two left on the apartment lease that I have to pay for.  I have already signed the lease on the new house, which was in demand.

I decide to pay the next few months on the apartment.  I offer to let Larry live in the apartment until the lease runs out.  He does this.

Chip Round Two

I am sitting late and alone at work.  The second round of the chip has proven more successful than the first one.  However, the customer is reporting a problem with this design.  We are able to simulate how the real chip will work on an FPGA.  So, we can make changes, test them immediately and predict how the chip will work when it comes out of the factory three months later.

There is a problem, and I am not able to reproduce it on my side.  I decide that it is due to a faulty FPGA that the customer has and decide to produce chip version three.  The customer agrees.

Larry Calls

Larry calls that night and explains that he is hungry and needs money.  He directs me to a gas station, and I meet him in the parking lot.  He gets into my car.  We talk for a while, and I give him forty dollars.  We separate and I am not sure if I go home or back to work.

Another time he calls me during the day.  He has been working for a contractor on a job site and was promised pay for that day and a ride back.  However, the contractor kicks him out at the end of the day and gives him neither.

It was hard to find him, but I did.  I gave him a ride back.  I believe it was a hot day and he was much too far away to walk back.

Losing Larry

I was debating allowing him to move in with me.  I had a spare bedroom.  I talk it over with him and he agrees.  After thinking it over, and getting feedback from my spiritual guide, I decide not to go through with it.  When I tell him this he asks: “Was your landlord going to charge you a big fee?”

One day, I realize that I have a spare debit card.  I had moved my money from one bank to another and I had a debit card for the old bank that had no money.  Hmm, I could put 20 dollars a month on this card and give it to Larry.

He takes it and it works well for some time. 

One month, as I am going through the transactions on this card, I realize that something does not look right.  There are several hundred-dollar charges on it.  Assuming there is an explanation, I dismiss it.

Around this time, I have not been able to find Larry at his usual spot outside the hardware store.

Another month goes by, and I check the card again.  This time, it becomes clear to me what happened.  These are real charges someone is making on it.  It turns out that the card has a $2000 overdraft limit that I did not know about.

At first, I think that someone has stolen the card and withdrawn that money.  This really hurts.  But then I realize it was Larry who did it.  This is much easier to accept.

Apparently when he needed more than the $20 on the card, he tried to withdraw more at the ATM.  It allowed him.  He just kept withdrawing until he hit the $2000 limit.

I call the bank to disable the card.  The lady does this and asks: “Did someone steal your card? If you file a police report, we can have the money refunded to you.”  “I will think about it” I say, and we end the call.

There was no chance that I was going to file a police report on Larry.  I took the money hit; it was about two weeks of pay to me.

Checking the ATM locations, I see that Larry took a Greyhound bus to another city in Texas.  I have not seen or heard from him since.

I felt guilty about what happened with him.  If I had not approached him, this would not have happened, I thought.

Chip Version Three

The customer is reporting that they are seeing the same problem on the chip that I had dismissed as an FPGA issue.  Now I am able to reproduce it.

On top of this, there are a whole lot of other problems: the chip does not meet a standard protocol stack that the customer now states they want it to support.

This means chip version four.  By the fourth version, the chip has spent about a year in the factory while we waited, and a whole lot of money has been spent on the masks.

Before chip version four, as I was reviewing the code, I came across a line of code that seemed wrong.  I did not understand what it was doing.  Not having the time to dig into it, I decided to make the value it wrote to a register changeable after the chip came out of the factory.  This was a quick change, and I went about doing other things and forgot about getting back to it.

The Bug

With this version of the chip, we can confirm that it supports the standard and works fine.  Will the customer accept it so that we can close out the project? I wondered.

I get a call from a tester at the customer.  The tester is reporting that if they continually present a tag (or an NFC phone) to the chip, it eventually, after hundreds of tries, just stops working.

This is unacceptable. They are asking for a solution.

I get the chip in the laboratory and after a few hours am able to reproduce the bug.

It turns out the bug is well hidden in the code.  I did not write it; it was carried over from the code that I got from previous developers.

If a packet or a phone is presented within a window about a millionth of a second wide, the chip goes into a state where it turns off the radio (or NFC) and goes to sleep.  It expects to wake up again when a new NFC packet comes in over the air.  But since the radio is off, it never does.

This state can be corrected by the host processor that is connected to the chip, but the host processor cannot detect when the chip enters this state.
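
The lock-up can be pictured with a toy model.  This is a minimal Python sketch, not the chip's firmware: the class name, the `radio_on` flag, the `in_race_window` argument and the helper function are all made up for illustration, and the real timing window is reduced to a simple boolean.

```python
class ChipModel:
    """Toy model of the lock-up; every name here is hypothetical."""

    def __init__(self):
        self.radio_on = True

    def present_tag(self, in_race_window=False):
        """Present a tag or phone. Returns True if the chip noticed it."""
        if not self.radio_on:
            # Radio is off, so the packet never reaches the firmware
            # and the chip stays asleep.
            return False
        if in_race_window:
            # Buggy path: the chip turns the radio off before sleeping.
            self.radio_on = False
        return True

    def host_reenable_radio(self):
        """The host can repair the state, but cannot tell when it must."""
        self.radio_on = True


def count_seen(chip, presentations):
    """Count how many presentations the chip noticed."""
    return sum(chip.present_tag(window) for window in presentations)
```

In this model, once a single tag lands in the window, every later tag is ignored until the host turns the radio back on, and nothing tells the host that it needs to.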

The Solution

As I debug this, I see where the bug occurs.  There are several hundred lines of code between that point and the point where the chip goes to sleep, forever.  I check every line it goes through: is there some way I can intervene to fix it?

I scan the code and find that the processor goes through the line that I had not understood and had made changeable after manufacturing.

It turns out this mysterious line of code was fine.  There were no problems with it, but by some great chance I could repurpose it to turn the radio back on.  It was only partially modifiable, but it was enough for me to turn on the radio.

The line was writing to the radio control register.  Two things about it mattered.  First, the register being written was fixed and could not be changed.  For my fix it had to be the radio control register, and it was.  If the line had written to any other register (there are hundreds of registers in the chip), the fix would not have worked.  Second, what I could change was the value being written to the register.  I just set it to turn the radio back on.
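
Here is a rough Python sketch of the idea, under stated assumptions rather than the real firmware.  The register address, the on/off values and all the names are hypothetical.  I also assume, for the sake of the model, that the unpatched line left the radio setting as it was, since all that is known about the original write is that it was harmless.

```python
RADIO_CTRL = 0x10  # hypothetical address of the radio control register
RADIO_OFF = 0x00   # hypothetical register values
RADIO_ON = 0x01


class PatchableChip:
    """Toy model: the write target is fixed in ROM, the value is not."""

    def __init__(self, patched_value=None):
        self.registers = {RADIO_CTRL: RADIO_ON}
        # Set after manufacturing; None means "not patched".
        self.patched_value = patched_value

    def mysterious_line(self):
        """The one line whose written value can be overridden."""
        if self.patched_value is None:
            # Assumption: the unpatched write leaves the radio as it was.
            value = self.registers[RADIO_CTRL]
        else:
            value = self.patched_value
        # The target register cannot be changed, only the value.
        self.registers[RADIO_CTRL] = value

    def go_to_sleep(self, in_race_window):
        """Run the sleep path. Returns True if the chip can wake later."""
        if in_race_window:
            # The hidden bug: the radio is switched off before sleeping.
            self.registers[RADIO_CTRL] = RADIO_OFF
        # The sleep path passes through the overridable line.
        self.mysterious_line()
        return self.registers[RADIO_CTRL] == RADIO_ON
```

With `patched_value=RADIO_ON`, the model always reaches sleep with the radio on, whether or not the race window was hit.  Without the patch, hitting the window leaves the radio off and the chip never wakes.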

This was not perfect, but it was acceptable.

This was the only line of code out of several thousand that I really had control to change, and it turned out that I could use it to correct a bug that it was not designed to fix.

I sent my solution to the tester.  She came back later: “It works! I can’t kill the chip!”

With that, the customer accepted the chip, and it went to mass production.

I was thanking God repeatedly for being able to fix that bug, without having to remanufacture the chip, again.  We had spent so much time – unexpected time – on the project that I was worried we were pushing the customer’s patience limit.

Final Thoughts

I have thought about this in a larger sense.  Perhaps we are like this chip, running an unchangeable routine.  Yeah, we get it right 1000 times, but if there is that one time when we are in big danger and don’t see it, it would help to have patching: regularly asking God or someone else if we are still on the right path and if any changes are needed.  Because if we don’t, and there is no way to reach us, then just like that chip, we may fall asleep and never wake up again.

For more reading, visit the main page at alexsblogs.com
