Black Widow repair using Fluke

douglasgb

Got a Black Widow board that has no vector output, and it makes a repeating sound like an old 45 record skipping. Self test does the same thing.

Probing pin 40 of the CPU reveals the board is in constant reset.

Hooked up the Fluke and pressed the key to do a Bus Test, which sees if the CPU can control the address bus lines and the data bus lines. The Fluke says "Active Force Line - Loop?" which means certain pins are forcing the CPU (or pod in this case) into a certain state. Pressing the More key shows "STS BTS 0001 1000" and the legend on the pod itself shows that for the 6502 those Status bits are the IRQ and the RESET lines. Looping won't help; we must tell the Fluke to ignore those Force lines. Do so by pressing the Setup key and then More until you get "Set-Trap Active Force Line? Yes" and then press the No key to change it.

You can then immediately press the Bus test key again and it should run.

On this board we get "Data bit 3 tied High - Loop?" We don't want to loop but we want to know if there are any other problems, so press No. We also get "Data bit 4 tied high" and then a whole series of bit pairs tied (together): 0 and 3, 0 and 4, 1 and 3, 1 and 4, 1 and 2, 2 and 4, 2 and 3, 3 and 5, 3 and 4, and 4 and 5. The last error is "Data bits 6 tied - Loop?" So clearly there are troubles on the data bus.
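
One way to make sense of that long list of tied pairs is to count how often each bit shows up. A minimal Python sketch, using only the pairs reported above (the names `tied_pairs` and `count_bits` are made up for illustration and are nothing the Fluke provides):

```python
from collections import Counter

# Tied bit pairs exactly as reported by the Bus Test above.
tied_pairs = [(0, 3), (0, 4), (1, 3), (1, 4), (1, 2),
              (2, 4), (2, 3), (3, 5), (3, 4), (4, 5)]

def count_bits(pairs):
    """Count how many reported pairs each data bit takes part in."""
    counts = Counter()
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    return counts

counts = count_bits(tied_pairs)
for bit in sorted(counts):
    print(f"D{bit}: appears in {counts[bit]} pairs")
```

Bits 3 and 4, the same two that were reported as tied high, each appear in five pairs, more than any other bit, which suggests they are at the center of the trouble.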
 
I'm a big fan of using the signature analyzer method for totally dead boards (vectors too, as you can load the Gravitar ROMs and run signature analysis). Disable the watchdog and strap the processor for NOP ($EA, I think). Hook CLK to PHI2 and Start and Stop to each of the ROM chip selects. Set edges to CLK=/, Start=\ and Stop=/. Then work through each of the ROM chip selects, including the vector ROMs. Also compare the output of the ROMs vs the data bus at the CPU socket (a la bad 245).

Once all the ROMs are good on the data bus, I use the Fluke to check all the RAMs.

Once the RAMs and ROMs are good, I use the Gravitar signature analyzer method (as outlined in the manual/schematics) to check the vector generator.
 
...I use the gravitar signature analyzer method (as outlined in the manual/schematics) to check the vector generator...

Signature analysis is a great idea since the vector generator is beyond the reach of the Fluke. The Fluke pod plugs into the CPU socket, and can see everything the CPU sees. While it can write to vector RAM (to fill it with instructions for the vector generator) and start and stop the vector generator, it is not involved in the actual processing that gets done by the state machine - it's not fast enough. That's why there is a state machine at the heart of the vector generator. It's like having a second CPU.

But first, let's see what else we can learn on the 6502 side of things. Usually when the data bus is shorted it will prevent much else from working, as it is the path into and out of the CPU. Looking at sheet 6A of the Black Widow schematics, however, shows that the data lines from the CPU go to the LS245 transceiver at F2 (to become DB0-DB7) but also branch off before that as pure D0-D7. The DB0-DB7 lines (the B means buffered by the LS245) go to the program RAM at N/P1, the read/write buffers for the High Score table, the Coin Door and Control Panel input/output, and the Vector Memory Data Buffer at P8. Meanwhile, the unbuffered D0-D7 go to the buffer for program ROM (E2) as well as the two POKEYs that read the Option Switch Input and do Audio Output.

The buffers are like valves, and allow the flow of information to be turned on or off (and in some cases control the direction of flow). What this means is that the CPU would use one path to get to all the devices on D0-D7 and another path for those connected via DB0-DB7. Any trouble on the D0-D7 lines will block the path to get to the DB0-DB7 lines, so if the CPU-facing side of E2 or F2 or the POKEYs are shorted we'll be stuffed. So let's test.
 
following, interested and have a question. Wouldn't a logic probe on CPU D3 (pin 30) and D4 (pin 29) have told you those data lines were tied high? Not questioning the use of the Fluke, but curious for you to highlight the differences it has made.

Probably alerting that the bit pairs are tied together?
 
...Wouldn't a logic probe on CPU D3 (pin 30) and D4 (pin 29) have told you those data lines were tied high? Not questioning the use of the Fluke, but curious for you to highlight the differences it has made.

Probably alerting that the bit pairs are tied together?

Good question. If they were both tied high (or low) then yes a logic probe would tell you that. But as you postulate they may not be stuck in either state, but rather stuck to each other so they rise or fall as a pair instead of independently. That might be hard to sense with a logic probe - though you would see it on a logic analyzer.
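
To illustrate why a pattern-driving test can catch lines that are shorted to each other when a static probe cannot, here is a rough Python model. It is not the Fluke's actual Bus Test algorithm. The "low wins" behavior of a shorted pair and the function names are assumptions made for the sketch.

```python
def read_back(value, tied_groups):
    """Model an 8-bit bus where the lines in each tied group are shorted.

    Assumption: a shorted group reads low if any line in it is driven low
    (low wins). Real hardware may behave differently.
    """
    result = value
    for group in tied_groups:
        if any(not (value >> bit) & 1 for bit in group):
            for bit in group:
                result &= ~(1 << bit)
    return result & 0xFF

def find_tied_pairs(tied_groups):
    """Walk a single 0 across the bus and note which other bits follow it."""
    pairs = set()
    for bit in range(8):
        driven = 0xFF & ~(1 << bit)          # all lines high except one
        seen = read_back(driven, tied_groups)
        for other in range(8):
            if other != bit and not (seen >> other) & 1:
                pairs.add((min(bit, other), max(bit, other)))
    return sorted(pairs)

# Hypothetical fault: D3 and D4 shorted to each other, nothing stuck.
print(find_tied_pairs([{3, 4}]))   # [(3, 4)]
```

In this model both lines still toggle, so a probe on either one would show activity. Only driving one line at a time and watching the others exposes the short.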

Also to clarify my earlier post: a problem on the buffered data bus (DB0-DB7) beyond the F2 transceiver would not necessarily get caught by the Bus test... because the chip enable (pin 19) or the direction (pin 1) may be set in such a way that the DB0-DB7 bus is isolated from the D0-D7 bus. That's one of the reasons why there is a chip at F2: so the connection can be turned on or off as desired.
 
Funny you should ask this, as I was going to post anyway.

It turns out I'm able to diagnose about 80% of bus issues with just my LP-560 logic probe (which has the audio function, which I love). I found you can often hear if a given line isn't 'right', once you've probed enough working boards to know what they 'normally' sound like. And yes, if they're just stuck high or low, those are the easy ones. However they're more often shorted to other lines, either internally within a given chip, or externally, from a solder short or other debris between two pins, so they'll often be toggling, just not toggling correctly.

I also have a trick that I'll use to find shorts, where you just check with a DMM. But the trick is to use two adjacent ROMs (or RAMs, depending on which bus it is), and hold one lead on a pin of one ROM, then rake the other lead across the pins of the adjacent ROM (which are all on the same bus, for most of the pins). That way you can buzz through every combination of all pins pretty quickly, and it works surprisingly well.

People spend big bucks on Flukes (and they're good tools, don't get me wrong), but I have one, and I still haven't gotten around to using it, as I just find I don't need it. I'm usually happy to find out that a board has CPU-side issues, as that's the 'easy' category, and can usually probe them out pretty quickly. (And if the above methods fail, I break out the SA, which will definitely find them, but is a little more labor intensive.)

But anyway, the more tools in the arsenal the better, so I appreciate the post. But you can get a lot of mileage out of 'old school' tools and a bit of practice, vs breaking out the big guns.
 
Obviously access to a Fluke doesn't make one a good electronic tech any more than access to a compiler makes one a good programmer. Lots of people have been asking for more tech tutorials on using their Fluke, so thanks for taking the time to post this.
 
...you can get a lot of mileage out of 'old school' tools and a bit of practice, vs breaking out the big guns.

Absolutely agree - and each tool has strengths and weaknesses. My approach to a repair is to gather information by observing what the board is doing (and is not doing), any info self-test can provide, and some quick probing (also with a LP-560) to know where to start. Then it's a calculation of the cost (in setup time, process time, etc) versus benefit (how much will be clarified) for what to do next.

dorkshoei said:
...access to a Fluke doesn't make one a good electronic tech any more than access to a compiler makes one a good programmer. Lots of people have been asking for more tech tutorials on using their Fluke, so thanks for taking the time to post this.

I'm documenting this repair with a bias for those who do have a Fluke but may not have a lot of experience in using it. I'm sure it won't be the only tool put to use before I'm done.
 
Looking at the data bus (D0-D7) at the CPU, it connects to:

-the 74LS245 at F2. This is a transceiver (transmitter/receiver) that allows data to flow from the left side of the chip to the right side, or the other direction (depending on the state of pin 1). Or, it can isolate the two sides so there is no connection at all (depending on pin 19). If these chips fail they usually do so in a way that data does not flow (as opposed to a short). Not impossible, but I'd say unlikely.

-the 74LS244 at E2. This buffer chip interfaces the data bus to the program ROM chips. It is a one-way chip (since data only comes out from the ROMs). Also unlikely.

-the POKEYs at B3 and C/D3. Because they are socketed, it is easy enough to pull them out and re-run the Bus test. Doing so shows the errors are gone! Reinstalling the top one and trying again shows no errors; reinstalling the lower one brings the errors back. So it's either the POKEY or the socket. Poor socket contact would typically result in the chip itself not working right, and shorted traces to the socket would be shorted whether a chip were installed or not. However, reinstalling the lower 'bad' POKEY in the upper socket does not cause the errors. Hmmm.

Taking a close look at the chip, there is some residual solder on the legs, suggesting this POKEY was pulled from a Ballblazer cartridge so there could be some internal damage if it got overheated. But it seems to work consistently in the top socket. Checking the lower socket reveals the contact in position one is a bit skewed. The schematic shows that's the ground connection for the POKEY chip. Of all the legs, a missing ground might cause problems like the ones we're seeing. I tried to confirm this by bending pin 1 of the good POKEY out of its socket and running the Bus test with just that chip in the upper socket, and got errors on data bits 3,4,5 and 6.

I was able to get the contact back into good shape using a sewing needle. I also cleaned the legs of the chip to remove the excess solder (so it would not re-bend the contact). I also flipped the positions of the POKEYs when I reinstalled them. Now with both in, the Bus test passes consistently. But we'll see how that holds up... may end up replacing that contact in the lower socket.
 
With the Bus test passing, we can move on to testing the RAM and ROM. I've already written a couple of scripts for the Fluke using the development environment from QuarterArcade.com (thanks!) and I have the RS-232/USB interface card from Piero Andreini (thanks!). You can also enter the test commands manually by looking up the addresses and checksums at http://tech.quarterarcade.com/tech/

Black Widow has just a single ROM version, so there's just one link to its MAME Virtual Machine near the bottom of the page. Clicking it shows the ROM regions and checksums, as well as what's happening at the various address ranges the CPU can access. Unfortunately the 'Fluke 9010a Script' generator link does not work.

The other place to turn for Atari games are their excellent schematics. In addition to being legible, well organized, and largely free from errors, they also include a memory map. With these three resources we can put the Fluke to good use.

[Attachments: Screen Shot 2018-03-17 at 4.49.32 PM.png, Screen Shot 2018-03-17 at 4.50.02 PM.png]

From those two snippets, let's figure out how to test the program ROMs. The Atari memory map counts from high to low and includes the names of the functions, while the MAME driver counts from low to high and includes the checksums. Doing a bit of collating tells us that ROM0 is accessed with addresses from 9000-9FFF and its 9010A checksum is 1651. In this case the ROM file uses the Atari part number 136017.101, and if your game board still has the stickers on the ROMs you will find that's at location D1 on the PCB. If you have no stickers then check out sheet 6B of the schematics, find the /ROM0 signal outside the left edge of the Read-Only Memory box, and follow it over to the rightmost of the six vertical rectangles, inside of which you will see D1 for its location. Simple, eh?

Doing the same for all the rest, or by parsing through the "BWGrav.9lc" script on my website we get:

ROM0 or 101 at D1 uses addresses 9000-9FFF and has a Fluke signature of 1651
ROM1 or 102 at E/F1 uses addresses A000-AFFF and has a Fluke signature of 4376
ROM2 or 103 at H1 uses addresses B000-BFFF and has a Fluke signature of 41A9
ROM3 or 104 at J1 uses addresses C000-CFFF and has a Fluke signature of 6D6A
ROM4 or 105 at K/L1 uses addresses D000-DFFF and has a Fluke signature of 0372
ROM5 or 106 at M1 uses addresses E000-EFFF and has a Fluke signature of 1D8C

VROM0 or 107 at L7 uses addresses 2800-2FFF and has a Fluke signature of 6FFA
VROM1 or 108 at M/N7 uses addresses 3000-3FFF and has a Fluke signature of 8C4F
VROM2 or 109 at N/P7 uses addresses 4000-4FFF and has a Fluke signature of 9A8D
VROM3 or 110 at R7 uses addresses 5000-5FFF and has a Fluke signature of F61B

Program RAM at N/P1 uses addresses 0000-07FF

Vector RAM at K7 uses addresses 2000-27FF
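
For anyone who prefers to keep this list in a script, here is a small Python sketch of it as a lookup table. The addresses, locations and signatures are the ones listed above. The table layout and the helper `chip_for_address` are only an illustration and have nothing to do with the Fluke script format.

```python
# (name, board location, first address, last address, Fluke signature or None)
MEMORY_MAP = [
    ("Program RAM", "N/P1", 0x0000, 0x07FF, None),
    ("Vector RAM",  "K7",   0x2000, 0x27FF, None),
    ("VROM0", "L7",   0x2800, 0x2FFF, "6FFA"),
    ("VROM1", "M/N7", 0x3000, 0x3FFF, "8C4F"),
    ("VROM2", "N/P7", 0x4000, 0x4FFF, "9A8D"),
    ("VROM3", "R7",   0x5000, 0x5FFF, "F61B"),
    ("ROM0",  "D1",   0x9000, 0x9FFF, "1651"),
    ("ROM1",  "E/F1", 0xA000, 0xAFFF, "4376"),
    ("ROM2",  "H1",   0xB000, 0xBFFF, "41A9"),
    ("ROM3",  "J1",   0xC000, 0xCFFF, "6D6A"),
    ("ROM4",  "K/L1", 0xD000, 0xDFFF, "0372"),
    ("ROM5",  "M1",   0xE000, 0xEFFF, "1D8C"),
]

def chip_for_address(address):
    """Return (name, board location) for an address, or None if unmapped here."""
    for name, location, first, last, _signature in MEMORY_MAP:
        if first <= address <= last:
            return name, location
    return None

print(chip_for_address(0x9000))   # ('ROM0', 'D1')
print(chip_for_address(0x2000))   # ('Vector RAM', 'K7')
```

Only the RAM and ROM regions listed above are in the table, so I/O addresses come back as None.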

A side note: each program ROM spans addresses 0-FFF, and FFF in hex equals 4095 in decimal (plus the zeroth address, so 4096 altogether). Since each address holds 8 bits of data, each ROM holds 32,768 bits. On this board they are TMS2532s. Vector ROMs 1-3 are the same, but notice vector ROM0 only goes from 2800-2FFF. Subtract 2800 and you have 0-7FF, which is 2048 addresses; times 8 makes 16,384 bits, and that chip is an Intel D2716. The two RAMs are also that 2048 x 8 size (but read/write of course).
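
The same arithmetic as a short Python sketch, in case you want to check other address ranges (the function name `chip_size` is just for illustration):

```python
def chip_size(first, last, data_bits=8):
    """Return (number of addresses, total bits) for an inclusive address range."""
    addresses = last - first + 1
    return addresses, addresses * data_bits

print(chip_size(0x9000, 0x9FFF))   # (4096, 32768)  program ROM0
print(chip_size(0x2800, 0x2FFF))   # (2048, 16384)  vector ROM0
```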
 
Once you have that information, running the actual tests is easy. For the ROM test, press the ROM key and then enter the starting address, the ending address, and the checksum. So, for ROM0, it's:

ROM 9000 Enter/Yes 9FFF Enter/Yes 1651 Enter/Yes

The Fluke will read the data at all 4,096 addresses (32,768 bits) and do some sort of cyclical mathematical operation on them that is supposed to be able to catch even a single bit that's incorrect. When it's done it will either say "OK" or complain that the checksum did not match by asking "ROM Error @ 9000-9FFF-Loop?" In the second case, you can press the More key to see what the actual checksum was, and sometimes that does help (e.g., a ROM is in the wrong socket).
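
To show the general idea of how a cyclic check catches a single wrong bit, here is a Python sketch using the standard library's CRC-CCITT routine. This is a stand-in chosen for illustration and is not the Fluke's own signature algorithm, so its results will not match the signatures listed above. The ROM contents are made up.

```python
import binascii

def signature(data):
    """16-bit CRC (CCITT polynomial) over the bytes, as four hex digits.

    A stand-in for illustration only, not the Fluke 9010A's algorithm.
    """
    return f"{binascii.crc_hqx(data, 0):04X}"

rom = bytes(range(256)) * 16        # 4096 bytes of made-up ROM contents
damaged = bytearray(rom)
damaged[0x123] ^= 0x08              # flip data bit 3 at a single address

print(signature(rom) != signature(bytes(damaged)))   # True
```

A plain sum could be fooled by two errors that cancel out. A cyclic check like this always changes when exactly one bit flips, which is why the single-bit case gets caught.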

On this board, all of the program ROMs have good checksums, but all of the vector ROMs have bad checksums.

Checking the RAM is even easier, since there's no checksum. For the program RAM it's just "RAM 0 Enter/Yes 7FF Enter/Yes" (you can omit the leading zeros).

On this board, the program RAM complains that there is a "R/W Error @ 0000 BTS FF-Loop?" Pressing No will advance to the next address, which yields the same fault.

The vector RAM test shows the same error (starting at its 2000 address and going on for as long as I kept advancing).
 
These results are very helpful. Since the program ROMs are good, we know the D0-D7 bus is good. All the bad stuff is on the buffered data bus DB0-DB7. If it were just the program RAM reading bad, then that chip would probably be bad. But everything on the DB0-7 bus is bad, so it's probably that the bus itself is not working properly.

Side note: "R/W Error @ 0000 BTS FF" means that for address 0000, the Fluke RAM test is finding that all eight bits (1111 1111 = FF) of that address are not working. Either the bit cannot be set, or if it is being set a different value is returned when read back. It doesn't really matter since bad means bad, but in this case every bit of every address I had the patience to test had the same failure, which is further evidence it's the bus.
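
A small Python sketch of reading a BTS value as a bit mask, assuming (as the FF example above implies) that each set bit in the mask flags the data bit with the same number. The helper name is made up.

```python
def failing_bits(bts_hex):
    """Return the data bit numbers flagged in a BTS mask such as 'FF'."""
    mask = int(bts_hex, 16)
    return [bit for bit in range(8) if (mask >> bit) & 1]

print(failing_bits("FF"))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(failing_bits("18"))   # [3, 4]  (a hypothetical mask, for comparison)
```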

If it were only the vector RAM and ROM, it could be either the LS245 at F2 which is at the start of the bus near the CPU, or the LS245 at P8 at the other end that interfaces with the vector stuff (sheet 7B). But because program RAM is also failing completely, I'm going to guess it's the chip they both use: the LS245 at F2.

This is an example of exactly what AndrewB was writing about: just turning the game on and using a $20 logic probe on both sides of F2 would show that the DB0-7 pins are dead (or stuck in one state). If I didn't have a Fluke with a 6502 pod, or even if I did but didn't have the easy ability to load the script, or if I hadn't written the script, then I probably would have done just that. Plus, the LS245 seems to be in the top 20 for chips that seem to fail.

On the other hand, we now know that all of the program ROMs are good, and as soon as we replace the LS245 we will be able to test the rest of the ROMs and RAM. Plus, testing them in place also tests the addressing that turns each chip on and off, the sockets the chips are in, and all the traces that are involved.
 
Replaced the LS245 at F2 and... no change!

Using the logic probe on the pins of the replacement chip shows that pin 1 is pulsing, but that pins 2-9 (DB0-DB7) are all stuck high. What could cause that? Trouble elsewhere on the DB0-DB7 bus of course, but it's unlikely that all eight of the lines would be stuck high. The other thing that would do this is a function of the LS245 itself... recall that one of the control pins (pin 1) controls the direction of the flow, but the other (pin 19) will prevent any flow of information, electrically disconnecting the two sides of the chip. Sure enough, probing pin 19 shows that it is stuck high, and the datasheet tells us that pin 19 high = all channels disabled.

This signal comes from the chip at R4 which sheet 6A mislabels as an LS00... but it's really an LS32. D'oh! The LS32 is an OR gate, which means it takes two inputs (pins 1 and 2) and if either one is high, it will send a high out of pin 3. So we need both inputs to be low, so the output will be low, so pin 19 of F2 will be low (at least some of the time).
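
The gate logic just described, as a tiny Python truth table. The function names are illustrative. The pin numbers are the ones given above.

```python
def ls32_or(pin1, pin2):
    """One gate of an LS32: the output is high if either input is high."""
    return pin1 or pin2

def f2_enabled(r4_pin1, r4_pin2):
    """F2 passes data only while its pin 19 (fed by R4 pin 3) is low."""
    return not ls32_or(r4_pin1, r4_pin2)

print("R4 pin1  R4 pin2  F2 enabled")
for pin1 in (False, True):
    for pin2 in (False, True):
        print(f"{pin1!s:7}  {pin2!s:7}  {f2_enabled(pin1, pin2)}")
```

The only row where F2 is enabled is the one with both inputs low, so a single input stuck high is enough to keep the buffered bus cut off all the time.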

Pin 1 of R4 is pulsing, but pin 2 is stuck high. Because of this, output pin 3 stays high. Pin 2 is the /I/OS signal coming from the custom address decoding PROM at R2. On this board that chip has already been removed and replaced with a socketed chip that's labeled "Gravitar (R1)". The wrong game is not a problem (Gravitar and Black Widow use the same board and this PROM is the same for either game). But we're at R2, not R1. R1 is another custom address decoding PROM that does chip select... not necessarily the same as R2. In fact the Atari part number for R1 (136010-X11) is different from the one for R2 (136010-X12), which is a good indication they are not the same.

So, to extend AndrewB's advice about basic tools, even more basic than a logic probe is using one's own two eyes to visually inspect the board, especially if it's already had work done on it.
 
Burned a new 27S19/83S123 PROM with the right code for R2 and installed it and the game runs! In fact, even with the original LS245 reinstalled the game still runs, so replacing that was unnecessary. Oh well. Sometimes you're the spider, sometimes you're the bug.

Since that repair was kind of truncated, let's do another!

EDIT: Actually, there was one more thing: while letting the board run I noticed lines of text did not run straight across the screen. It almost looked right but a close inspection showed every few letters were a bit lower than those before. Normally this could be caused by a BIP pot being out of adjustment, but the lines themselves were meeting up nicely. Visual inspection around the area revealed a broken cap at C71, without which the DAC in charge of BIP and the X and Y voltage references would not be getting any -15 volts. I was surprised it looked as good as it did. Replacing the cap made the picture worse because the BIP pots were now way off. Adjusting them helped but things still weren't right. Thinking that it would be "bad" to run the DAC (for who knows how long) with the negative reference input floating, I replaced the DAC08 at D9 and now the picture looks great.
 
If you're taking requests for a topic, one thing that still puzzles me is when you get a RAM error while using the 9010, how do you translate from the bad address reported on the Fluke to the actual RAM chip on the board? Especially if you don't have good schematics to work from.
 
Everything is tougher without good schematics, of course. But the first step would be to know what kind of RAM you are dealing with. For example, the 2114 RAM in Asteroids has 1024 addresses (since it uses A0-A9, which is ten bits, or 0000000000 to 1111111111). However, it only has four data lines so each address can only store 4 bits. That's why they are often used in pairs. The same address is set by the CPU and that address goes to two chips, while the eight bits of data are split between them. There's also a "Chip Select" pin which (in this case) must be low for the chip to do anything (otherwise the data pins are disconnected). The other control pin is "Write Enable", which tells the chip whether to ingest and store the bits present on the four data pins (when low) or retrieve the bits it has stored and set the data pins accordingly (when high).

So if you are really stuck, you could have the Fluke loop a write command on a single RAM address, and then probe the CE and WE pins of each RAM chip to find ones where both are low. Then, loop a read command on the same address and look among those you found in step one for the chips where CE is still low but WE is now high. You will probably find two (unless the trouble is with the CE and WE signals themselves, but in that case you'd be having trouble with the entire range of addresses).

To find out which of the two chips is at fault, you'll need to do manual reads and writes using specific data. Assuming the game uses one chip for data bits 0-3 and the other for bits 4-7, you could write 0F to your address (which is 0000 1111) and probe the four data pins of each of the two chips to see which ones are experiencing high pulses, and that will be the chip that handles bits 0-3. To confirm, you can write F0 (1111 0000) and the first chip data lines should all stay low while the other chip's data lines will now be getting the high pulses. Finally, write FF to that address and then read it back. Whichever nibble (upper or lower four bits) does not return F is the chip that can't store the bit. If you do get FF, repeat the test writing 00 to the address, and then seeing which nibble is not zero when read back, and you will find which chip has a bit stuck high.
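
The FF / 00 comparison above can be written down as a short Python sketch. It assumes, as in the paragraph, that one chip handles bits 0-3 and the other bits 4-7. The function name and the labels are made up for illustration.

```python
def suspect_chip(read_after_ff, read_after_00):
    """Compare read-backs after writing FF and 00 and name the bad nibble(s).

    Assumes one 2114 holds bits 0-3 (low nibble) and the other bits 4-7.
    """
    suspects = set()
    for written, read in ((0xFF, read_after_ff), (0x00, read_after_00)):
        wrong = written ^ read          # 1 wherever a bit came back wrong
        if wrong & 0x0F:
            suspects.add("low nibble chip (bits 0-3)")
        if wrong & 0xF0:
            suspects.add("high nibble chip (bits 4-7)")
    return sorted(suspects)

# Example: FF reads back as FB (bit 2 will not go high), 00 reads back fine.
print(suspect_chip(0xFB, 0x00))   # ['low nibble chip (bits 0-3)']
```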

If you have halfway decent schematics but just no memory map, you can get a head start on figuring this out by looking at the name of the signal that goes to the CS pin. For example, take a look at the schematics for the sound daughter card of Omega Race. It uses one pair of 2114 RAM chips (at K4 and J4) as shown in the attached picture. Pin 18 is shown connected to +5v on K4, but that connection occurs on both chips. Why they show only one for that but show the ground pin 9 for both chips is a good question. The CE pin we're looking for is pin 8, shown only for J4 but also connected to K4. So these would qualify as not great schematics.

Anyway, trace pin 8 backwards and you see it comes through the LS04 at J6, which just inverts whatever it gets from the output of the LS08 at H1. That's an AND gate with address line 12 as one of its inputs. The other input is from the NAND gate (LS00) at H6. It is monitoring signals that are active whenever there's a memory read or write in progress. The line above the name tells you they are active when low. Trace back further to see how those two signals are built, but it means that the output of H6 will be high when either (or both) of those inputs are low. In turn, the output of H1 will be high when memory is being read or written AND address line 12 is high, which J6 inverts to output a low to J4 and K4, enabling them. The RAMs themselves are using address lines A0-A9, so along with A12 that means 1 xx00 0000 0000 thru 1 xx11 1111 1111, where xx are the disregarded address lines 11 and 10. Converting that binary to hex (taking xx as 00) we get address space 1000-13FF. Looking up the MAME Virtual Machine data at QuarterArcade, we see that CPU2 (the sound CPU) has matching read-write address space (0-07FF is the sound ROM).
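
Here is that decoding worked through in a Python sketch. It models only what the trace above describes: A12 must be high, A10 and A11 are ignored, and A0-A9 pick the cell. The memory read/write gating and any higher address lines are left out, and the names are mine.

```python
RAM_CELLS = 1 << 10   # A0-A9 give 1024 cells

def decode(address):
    """Return the RAM cell selected by a 13-bit address, or None if A12 is low."""
    if not (address >> 12) & 1:          # A12 must be high to enable the pair
        return None
    return address & (RAM_CELLS - 1)     # A10 and A11 are disregarded

def mirror_bases():
    """Base addresses that all land on cell 0 (A12 high, A11/A10 either way)."""
    return [(1 << 12) | (a11 << 11) | (a10 << 10)
            for a11 in (0, 1) for a10 in (0, 1)]

print([f"{base:04X}-{base + RAM_CELLS - 1:04X}" for base in mirror_bases()])
# ['1000-13FF', '1400-17FF', '1800-1BFF', '1C00-1FFF']
```

With the two disregarded lines at 0 you get the 1000-13FF range. Under this simplified model the other three ranges would reach the same cells.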
 

[Attachment: Screen Shot 2018-03-19 at 7.48.41 PM.png]

Board #2 will not boot or run self test. Hooking it up to the Fluke.... no wait, first a visual inspection shows no prior work done on the board, and no apparent damage. Unless you count the fact that capacitor C16 is missing! That goes just below the CPU and is involved in the High-Score table, which I don't think would put the kibosh on the whole board. But of course I installed a replacement.

OK, now the Fluke tells us that the vector ROMs and vector RAM are good, the program ROMs are good, but the program RAM returns BTS errors at every address: mostly FF but sometimes AF or F7 or others, and the results change each time the test is run. Swapping in a known good RAM does not help. Inspecting the socket does not reveal anything obvious. Since everything mentioned above is working, we know the data and address busses are ok. That leaves only the control signals that go to that specific chip: the /WRITE signal going into pin 21, the BANK SEL signal that selects the high or low half of the RAM via address line 10, and the /RAM signal going into the chip enable pin 18.

The plan is to run a looping read of address 0, which should cause pin 18 to be low (because /RAM is active low), pin 21 to be high (because /WRITE is active low and we aren't writing), A0-A10 to all be low (for address 0), and BANK SEL I don't really know (even after reading its description on sheet 9A). The second phase will be a looping write of FF to address 7FF, which should cause pin 18 to be low again (active RAM), pin 21 to also be low (writing), A0-A10 to all be high (address 7FF = 0111 1111 1111) and BANK SEL to do... something.
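
As a checklist for the probing, here is a Python sketch that spells out the expected levels for each phase. The pin numbers are the ones quoted above, BANK SEL is deliberately left out since its behavior is unknown here, and the function name is made up.

```python
def expected_pins(address, writing):
    """Expected logic levels at the program RAM during one looping access.

    Pin 18 is /RAM and pin 21 is /WRITE, as quoted above. On a looping
    test these will show as pulses to the listed level, not steady states.
    """
    levels = {
        "pin 18 (/RAM)": "low",                          # chip selected
        "pin 21 (/WRITE)": "low" if writing else "high",
    }
    for line in range(11):                               # A0-A10
        levels[f"A{line}"] = "high" if (address >> line) & 1 else "low"
    return levels

for label, address, writing in (("read 0000", 0x000, False),
                                ("write 07FF", 0x7FF, True)):
    print(label)
    for pin, level in expected_pins(address, writing).items():
        print(f"  {pin}: {level}")
```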
 