Reverse engineering the XE-A207 cash register file format

contents:
So far 2013-08-18
Discovering xxd 2013-08-20
Looking at offsets, some other quick things 2013-08-25
Thinking about conversion 2013-10-13
A minor DB detail 2013-10-14
iconv syntax, problematic transactions 2013-10-20
Bash script, the beginnings of usefulness 2013-10-27
Revised bash script 2013-12-29
A note about PLUDT.SDA 2014-01-04
A few bash script additions 2014-04-19
More bash tweaks 2014-05-15
Offsets, for the record 2015-12-27

So far 2013-08-18

So my partner and I have a small business, and only accept cash. We have a cash register that does a fine job-- it adds things up correctly and gives us totals at the end of the day which more or less usually match what is in the register. We thought it would be fun to collect more data though, so we got a Sharp XE-A207 which can save all your sales data to an SD card. We were hoping it would be in some kind of unobfuscated format and as it turns out it mostly is. We were also hoping that the program data could be backed up and possibly modified via the SD card and it appears that this is probably possible as well. In all honesty, "reverse engineering" is probably a somewhat overblown description of what we're trying to do, but it's the best I could come up with. I should also probably mention that I have next to no idea what I'm doing here, so if I use incorrect terminology or what have you that is why.

Oh! Also, forgot to mention that Sharp does in fact provide a program to do just what we want, but it's Windows only. So no go on that.

So the goals here (roughly in order of importance/easiness) are: get the sales data into a database so we can poke at it, and figure out the program data files well enough to back them up and maybe modify them from the SD card.

Sales data is stored in some kind of nonvolatile memory which you can then copy to the SD card. It's stored in a file called EJFILE.SDA (EJ = "electronic journal", from when registers had a "journal tape" which kept a running log of every transaction) which appears to be a relatively straightforward text file. It is in MS-DOS Latin US though, which makes some characters display strangely. A sample follows:

#000059  2013/08/17 14:50:00  
01 bubo                 000666
1@ 3.50                  $3.50
drip coffee 0.4l              
1@ 3.00                  $3.00
190 kit                       
1@ 3.00                  $3.00
cafÇ au lait                  
1@ 3.00                  $3.00
short cappuccino              
1@ 0.50                  $0.50
add jellies                   
1@ 4.00                  $4.00
affogato                      
1@ 2.50                  $2.50
milk tea                      
1@ 1.00                  $1.00
wholesale bakes               
1@ 1.00                 ¿$1.00
misc taxable                  
ITEMS 9Q                      
 C A S H           $ 2 1 . 5 0

The "Ç" in "cafÇ au lait" is supposed to be a "é" obviously, it's represented by \x82 in a hex dump of the file. The "¿" on the last line is \xC0 for the record. What you see above was copied and pasted from TextWrangler after just changing the extension of the file to .TXT, the hex codes are from using the "Hex Dump File..." menu command in TextWrangler.
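Incidentally, those are the same bytes read as Mac Roman (0x82 is Ç and 0xC0 is ¿ there), which is presumably what TextWrangler was defaulting to. You can sanity-check the CP437 guess from the command line, assuming iconv is around:

```shell
# byte 0x82 (octal 202) read as CP437 should come out as an é in UTF-8
printf '\202' | iconv -f cp437 -t utf-8
# -> é
```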

All in all I don't expect too many problems massaging the EJ into a SQL friendly format, although suggestions or hints would be gladly appreciated. I am also acutely aware that what I am planning on doing might not be the best course of action, but YOLO &c.

The program data is where things start to get a little (more) hairy. Roughly, price information is stored as department info or PLU info. Dept info is supposed to be for categories of items (dairy/seafood/bakery/etc) and PLU info is for specific items (quart milk/salmon head/cupcake/etc) which then belong to departments. For the sake of expediency we have most of the department keys programmed to be specific items, or another way to think about it is that we have a "cappuccino department" which could theoretically have a number of items assigned to it. It's just the way the buttons are set up, don't ask me why that is. For some things it makes more sense to set up PLUs though, like items that we have in stock so infrequently they don't merit having a button all to themselves. Specific coffee beans, baked items, and so forth get PLUs. A hex dump of the PLU file (PLUDT.SDA) looks like this:

                                - dept
                               |   - type
                               |  |
                        [plu] [] []             [$$$] [name                                         ]
0000: 00 00 00 00 00 00 00 01 15 06 00 00 00 00 02 50 73 63 6F 6E 65 00 00 00 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿Pscone¿¿¿¿¿¿¿¿¿¿¿
0020: 00 00 00 00 00 00 00 02 15 06 00 00 00 00 02 50 62 72 65 61 6B 66 61 73 74 20 63 6F 6F 6B 69 65 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿Pbreakfast cookie
0040: 00 00 00 00 00 00 00 03 15 06 00 00 00 00 02 00 75 73 64 32 20 63 6F 6F 6B 69 65 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿usd2 cookie¿¿¿¿¿
0060: 00 00 00 00 00 00 00 04 15 06 00 00 00 00 03 00 6D 75 66 66 69 6E 00 00 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿muffin¿¿¿¿¿¿¿¿¿¿
0080: 00 00 00 00 00 00 00 05 15 06 00 00 00 00 02 50 63 6F 66 66 65 65 20 63 61 6B 65 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿Pcoffee cake¿¿¿¿¿
00A0: 00 00 00 00 00 00 00 06 15 06 00 00 00 00 01 50 6D 61 6C 61 73 61 64 61 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿Pmalasada¿¿¿¿¿¿¿¿

All that stuff at the top in brackets is notes; the dump starts with "0000:". So line one is [some zeroes] ... [plu (0001)] ... [department (15, baked stuff)] ... [sale type (6)] ... [more zeroes] ... [price ($2.50)] ... [name (scone)]. Pretty straightforward I guess. I should probably mention that all the program files have companion files that I have not figured out in the least. They bear the extension .FDS (eg PLUDT.SDA has a PLUDT.FDS) and I don't know if they're a checksum or datestamp or what. Here's what the dump of PLUDT.FDS looks like:

0000: 11 00 09 C4 00 00 00 01 01 08 04 00 04 10 00 00 	¿¿ΔÄ¿¿¿¿¿¿¿¿¿¿¿¿
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿
0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 	¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿¿
0050: 00 00 00 00 00 00 00 00 00                      	¿¿¿¿¿¿¿¿¿
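Going back to PLUDT.SDA for a second: assuming the layout above holds (PLU number in bytes 6-7, department in byte 8, price as BCD in bytes 14-15, name in the last 16 bytes -- all guesses from eyeballing the dump), a quick bash sketch can pull the fields out:

```shell
#!/bin/bash
# parse_pludt: split PLUDT.SDA into 32-byte records and print
# plu|dept|price|name for each. Offsets are guesses from the dump above;
# the numbers look like BCD, so the hex digits can be read as decimal.
parse_pludt() {
    xxd -p -c 32 "$1" | while read -r rec; do
        plu=${rec:12:4}                              # bytes 6-7: PLU number
        dept=${rec:16:2}                             # byte 8: department
        price="$((10#${rec:28:2})).${rec:30:2}"      # bytes 14-15: price, eg 02 50 -> 2.50
        name=$(echo "${rec:32}" | xxd -r -p | tr -d '\000')  # bytes 16-31: name, null padded
        printf '%s|%s|%s|%s\n' "$plu" "$dept" "$price" "$name"
    done
}

# usage: parse_pludt /path/to/PLUDT.SDA
```

No idea yet what the type byte actually encodes, so this just skips it.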

Department data files get even weirder. There are 99 possible departments but we're only going to be using about 25 of them. Here's what the dump looks like:

00000000  01 43 00 99 99 99 99 00  00 00 00 03 50 69 63 65  |.C..........Pice|
00000010  64 20 63 6f 66 65 65 00  00 00 00 00 00 02 43 00  |d cofee.......C.|
00000020  99 99 99 99 00 00 00 00  02 50 61 6d 65 72 69 63  |.........Pameric|
00000030  61 6e 6f 00 00 00 00 00  00 00 03 43 00 99 99 99  |ano........C....|
00000040  99 00 00 00 00 02 50 65  73 70 72 65 73 73 6f 00  |......Pespresso.|
00000050  00 00 00 00 00 00 00 04  43 00 99 99 99 99 00 00  |........C.......|
00000060  00 00 02 50 64 72 69 70  20 63 6f 66 66 65 65 00  |...Pdrip coffee.|

So the lines should actually be 29 bytes long... Here's what it looks like after a little massaging:

                                 [$  ] [i  c  e  d     c  o  f  e  e                 ]
01 43 00 99 99 99 99 00 00 00 00 03 50 69 63 65 64 20 63 6f 66 65 65 00 00 00 00 00 00 
02 43 00 99 99 99 99 00 00 00 00 02 50 61 6d 65 72 69 63 61 6e 6f 00 00 00 00 00 00 00 
03 43 00 99 99 99 99 00 00 00 00 02 50 65 73 70 72 65 73 73 6f 00 00 00 00 00 00 00 00

And yes I'm aware that I misspelled "coffee" up there. It's been fixed in the register.

So now all that remains is to figure out a way to generate these binary files and we should be good to go! Piece of cake right?
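As a proof of concept for the generating part, here's a sketch that builds one 29-byte DEPTDT.SDA record. The 43 00 and 99 99 99 99 bytes mean who knows what, so they're just copied verbatim from the dumps; the field offsets are my guesses, and I have no idea whether the register would accept the result:

```shell
#!/bin/bash
# make_dept_record: build one 29-byte DEPTDT.SDA record.
#   dept:  department number as a plain number (1-99)
#   price: the four BCD digits, eg 0350 for $3.50
#   name:  up to 16 characters, padded out with 0x00
make_dept_record() {
    local dept=$1 price=$2 name=$3
    local namehex
    namehex=$(printf '%s' "$name" | xxd -p | tr -d '\n')
    while [ ${#namehex} -lt 32 ]; do namehex="${namehex}00"; done  # pad name to 16 bytes
    # layout guess: dept, 43 00, 99 x4, 00 x4, price BCD, 16-byte name
    printf '%s' "$(printf '%02d' "$dept")43009999999900000000${price}${namehex}" | xxd -r -p
}

# usage: make_dept_record 3 0250 espresso > record.bin
```

The %02d trick works here because BCD of a two-digit number looks the same as its decimal representation.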

Discovering xxd 2013-08-20

Oh wow it looks like there's a Unix/Linux utility called xxd that does back-and-forth binary translation, pretty cool. Here's some output:

0000057: 04 43 00 99 99 99 99 00 00 00 00 02 50 64 72 69 70 20 63 6f 66 66 65 65 00 00 00 00 00  .C..........Pdrip coffee.....
0000074: 05 43 00 99 99 99 99 00 00 00 00 03 75 63 61 66 82 20 6c 61 74 74 65 00 00 00 00 00 00  .C..........ucaf. latte......
0000091: 06 43 00 99 99 99 99 00 00 00 00 03 50 63 61 70 70 75 63 63 69 6e 6f 00 00 00 00 00 00  .C..........Pcappuccino......
00000ae: 07 43 00 99 99 99 99 00 00 00 00 03 00 63 6f 72 74 61 64 6f 00 00 00 00 00 00 00 00 00  .C...........cortado.........
00000cb: 08 43 00 99 99 99 99 00 00 00 00 02 75 6d 61 63 63 68 69 61 74 6f 00 00 00 00 00 00 00  .C..........umacchiato.......

This was generated by issuing

xxd -c 29 -g 1 ~/DEPTDT.SDA > ~/DEPTDT.SDA.txt

Pretty exciting! Cautious optimism here.

Looking at offsets, some other quick things 2013-08-25

Looks like the .SDA files all have arbitrary record lengths. Here's a quick applescript I threw together to hex dump them with the correct -c values-- the ones I could figure out anyway.

on run
	display dialog "drag! drag! ok!" buttons "okok" with icon caution
end run

on open filelist
	repeat with i in filelist
		set pi to quoted form of POSIX path of i
		--get basename
		set filename to do shell script "basename " & pi
		if filename = "PLUDT.SDA" then
			set options to "-c 32 -g 1"
		else if filename = "DEPTDT.SDA" then
			set options to "-c 29 -g 1"
		else if filename = "CLKDT.SDA" then
			set options to "-c 22 -g 1"
		else if filename = "LOGODT.SDA" then
			set options to "-c 31 -g 1"
		else if filename = "FUNCDT.SDA" then
			set options to "-c 14 -g 1"
		else
			set options to "-g 1"
		end if
		do shell script "xxd " & options & " " & pi & " > " & pi & ".txt"
	end repeat
end open
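For the non-applescript crowd, the same dispatch works as a plain bash function (widths are just the ones worked out so far):

```shell
#!/bin/bash
# xxd_width: pick the xxd options for a given .SDA file by its basename
xxd_width() {
    case "$(basename "$1")" in
        PLUDT.SDA)  echo "-c 32 -g 1" ;;
        DEPTDT.SDA) echo "-c 29 -g 1" ;;
        CLKDT.SDA)  echo "-c 22 -g 1" ;;
        LOGODT.SDA) echo "-c 31 -g 1" ;;
        FUNCDT.SDA) echo "-c 14 -g 1" ;;
        *)          echo "-g 1" ;;
    esac
}

# usage: xxd $(xxd_width /path/PLUDT.SDA) /path/PLUDT.SDA > /path/PLUDT.SDA.txt
```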

Still not having much luck with the .FDS files. Most of them only have one byte of data padded out with zeroes, for example here's the dump of TAXTB.FDS:

0000000: 00 d6 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000050: 00 00 00 00 00 00 00 00 00                       .........

I'm assuming TAXTB.SDA is for tax tables, which we aren't using at the moment-- nothing we sell is taxable. Here's the dump of TAXTB.SDA:

0000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
...more zeroes here...
0000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000160: 00 00 00 00 00 00 00 00                          ........

So the .FDS files that accompany the .SDA files that actually contain something are marginally more interesting. Here's the dump of CLKDT.FDS for example:

0000000: 51 00 00 19 00 00 00 01 01 01 04 01 00 10 00 00  Q...............
0000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0000050: 00 00 00 00 00 00 00 00 00                       .........

And here's the dump of the accompanying CLKDT.SDA file:

0000000: 01 00 00 00 00 42 62 75 62 6f 00 00 00 00 00 00 00 00 00 00 00 00  .....Bbubo............
0000016: 02 00 00 00 00 00 70 72 69 73 00 00 00 00 00 00 00 00 00 00 00 00  ......pris............
000002c: 03 00 00 00 00 00 74 68 6f 72 00 00 00 00 00 00 00 00 00 00 00 00  ......thor............
0000042: 04 00 00 00 00 00 43 4c 45 52 4b 30 34 00 00 00 00 00 00 00 00 00  ......CLERK04.........

Looks like Bubo gets the "42" designation because he's the manager? Or maybe he's the answer to the meaning of life?

I really have no idea where to go from here. I should probably also mention that I would love any input, if you're reading this chances are you already have some way to contact us but if not you can use our contact form. Feedback from other XE-A207 users would be cool too, would this be helpful to anyone? Does anyone know where to get a different coin tray? We only need one or two compartments-- I'm thinking of dremelling out the dividers but maybe there's a better solution.

Welp. I just tried to import a modified .SDA file back into the register and it wouldn't take it. I modified CLKDT.SDA to read

0000000: 01 00 00 00 00 42 62 75 62 6f 00 00 00 00 00 00 00 00 00 00 00 00  .....Bbubo............
0000016: 02 00 00 00 00 00 70 72 69 73 00 00 00 00 00 00 00 00 00 00 00 00  ......pris............
000002c: 03 00 00 00 00 00 74 68 6f 72 00 00 00 00 00 00 00 00 00 00 00 00  ......thor............
0000042: 04 00 00 00 00 00 5A 65 52 47 32 B8 B8 B8 00 00 00 00 00 00 00 00  ......CLERK04.........

Then re-binaried (reverted?) it using

xxd -r ~/CLKDT.SDA.txt ~/CLKDT.SDA

This is what the dump of the new CLKDT.SDA file looks like:

0000000: 01 00 00 00 00 42 62 75 62 6f 00 00 00 00 00 00 00 00 00 00 00 00  .....Bbubo............
0000016: 02 00 00 00 00 00 70 72 69 73 00 00 00 00 00 00 00 00 00 00 00 00  ......pris............
000002c: 03 00 00 00 00 00 74 68 6f 72 00 00 00 00 00 00 00 00 00 00 00 00  ......thor............
0000042: 04 00 00 00 00 00 5a 65 52 47 32 b8 b8 b8 00 00                    ......ZeRG2.....

Not sure if those last six bytes are significant or if there's a way to pad the file out using xxd. I didn't try generating a new CLKDT.FDS file, mainly because I had no idea what to put in it. Incidentally \x32 is supposed to be a space and \xb8 is supposed to be a ╕ which looks pretty close to the Korean ㅋ, but so much for that gag. ¯\_(ツ)_/¯
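For the padding question: xxd -r won't pad the file back out on its own, but something like this would (assuming NUL padding is what the register wants, which I haven't verified):

```shell
#!/bin/bash
# pad_to FILE SIZE: pad FILE with 0x00 bytes out to SIZE bytes.
# Whether the register actually requires the original file size, or NUL
# padding specifically, is still an open question.
pad_to() {
    local cur
    cur=$(wc -c < "$1")
    if [ "$cur" -lt "$2" ]; then
        dd if=/dev/zero bs=1 count=$(( $2 - cur )) >> "$1" 2>/dev/null
    fi
}

# usage: pad_to CLKDT.SDA 88    # 4 records x 22 bytes
```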

Thinking about conversion 2013-10-13

So we have been sitting on this for a while, partly to generate a good amount of data and partly because we're not sure of the best way to proceed. The encoding issues have been solved with iconv, so what used to look like this

#004659  2013/10/10  9:25:07  
01 bubo                 000666
1@ 3.75                USD3.75
cafÇ latte                    
1@ 3.00                USD3.00
muffin                        
ITEMS 2Q                      
 C A S H         U S D 6 . 7 5
                              
#004660  2013/10/10  9:25:40  
01 bubo                 000666
1@ 2.50                USD2.50
drip coffee                   
ITEMS 1Q                      
 C A S H         U S D 2 . 5 0
                              
#004661  2013/10/10  9:25:47  
01 bubo                 000666
     nî sÜlà

Now looks like this

#004659  2013/10/10  9:25:07  
01 bubo                 000666
1@ 3.75                USD3.75
café latte                    
1@ 3.00                USD3.00
muffin                        
ITEMS 2Q                      
 C A S H         U S D 6 . 7 5
                              
#004660  2013/10/10  9:25:40  
01 bubo                 000666
1@ 2.50                USD2.50
drip coffee                   
ITEMS 1Q                      
 C A S H         U S D 2 . 5 0
                              
#004661  2013/10/10  9:25:47  
01 bubo                 000666
     nö sålê

Hooray! Now we just have to figure out the best way to get from there to something like

INSERT INTO transactions (trans_id, type, item, amount, stamp)
VALUES (4659, 1, 'café latte', 375, '2013-10-10 09:25:07'),
(4659, 1, 'muffin', 300, '2013-10-10 09:25:07'),
(4660, 1, 'drip coffee', 250, '2013-10-10 09:25:40'),
(4661, 2, NULL, NULL, '2013-10-10 09:25:47');

At least every transaction is separated by two line breaks, so there's that. The "01 bubo" part never changes, nor does the "000666" part, just fyi. We don't care about the "ITEMS 2Q" part or the "C A S H U S D 6 . 7 5" part.

There are some transactions that look like this

#004807  2013/10/11  9:09:51  
01 bubo                 000666
1@ 2.50                USD2.50
drip coffee                   
drip coffee             V-2.50
1@ 3.50                USD3.50
drip coffee 0.4l              
oc discount              -0.25
1@ 2.50                USD2.50
tiny pie                      
1@ 3.75                USD3.75
café latte                    
ITEMS 3Q                      
 C A S H         U S D 9 . 5 0

Where we hit the wrong button, or the customer changes their mind, or whatever. So we'll need to account for those somehow. The whole block up top can just go away, we don't need to keep those voids. The discount for bringing your own cup should be tied to the item above it, but that's not critical.
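For the voids, one (admittedly gross) idea: since a void line seems to always sit right under the price/name pair it cancels, a little awk buffer can drop all three lines. That "always" is an assumption on my part:

```shell
#!/bin/bash
# strip_voids: drop "item V-x.xx" void lines along with the price and name
# lines directly above them. Assumes a void always immediately follows the
# pair it cancels, which may not hold for every transaction.
strip_voids() {
    awk '
        / V-[0-9]/ { n = 0; next }            # void: toss it and the buffered pair
        { buf[n++] = $0
          if (n == 3) {                       # window full, oldest line is safe
              print buf[0]
              buf[0] = buf[1]; buf[1] = buf[2]; n = 2
          }
        }
        END { for (i = 0; i < n; i++) print buf[i] }
    ' "$1"
}

# usage: strip_voids transactions.txt
```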

A minor DB detail 2013-10-14

So after a little consideration the INSERTs should probably look more like this

INSERT INTO transactions (trans_id, type, stamp)
VALUES (4659, 1, '2013-10-10 09:25:07'),
(4660, 1, '2013-10-10 09:25:40'),
(4661, 2, '2013-10-10 09:25:47');
INSERT INTO transaction_items (trans_id, item, amount)
VALUES (4659, 'café latte', 375),
(4659, 'muffin', 300),
(4660, 'drip coffee', 250);

iconv syntax, problematic transactions 2013-10-20

For my reference (and yours) the command to convert the native register format into something more helpful would be something along the lines of

iconv -f cp437 -t utf-8 ~/EJFILE.SDA > ~/EJFILE.SDA.utf8.txt

So I was looking at the EJ and the vast majority of transactions look more or less like this

#005599  2013/10/18  9:15:06  
01 bubo                 000666
1@ 2.50                USD2.50
americano                     
1@ 2.50                USD2.50
drip coffee                   
ITEMS 2Q                      
 C A S H         U S D 5 . 0 0

One or more items paid cash, that's it. If I can figure out how to fish out what we need from this type of transaction that would be great for 99% of our rings. I think I'd like to keep the quantity (and total amount) info for the transaction after all, so the INSERT for the above transaction would probably look something like

INSERT INTO transactions (trans_id, type, quantity, total, stamp)
VALUES (5599, 1, 2, 500, '2013-10-10 09:25:07');
INSERT INTO transaction_items (trans_id, item, amount)
VALUES (5599, 'americano', 250),
(5599, 'drip coffee', 250);

The thing is, we already have menu items in a database so it would be comparatively easy to make the item INSERTs look like this instead

INSERT INTO transaction_items (trans_id, item, amount)
VALUES (5599, 2, 250),
(5599, 21, 250);

But maybe I'm getting ahead of myself.

The only places I see us running into problems are where there's an item void (eg not the entire transaction), like this

#005471  2013/10/17  9:54:25  
01 bubo                 000666
1@ 3.50                USD3.50
iced coffee                   
1@ 2.50                USD2.50
americano                     
americano               V-2.50
ITEMS 1Q                      
 C A S H         U S D 3 . 5 0

Where the person changed their mind about the americano, or we hit the button by accident, or whatever. They still wanted the iced coffee though. I also mentioned above that we give a discount for bringing your own cup, those transactions look like this

#005501  2013/10/17 11:51:13  
01 bubo                 000666
1@ 2.50                USD2.50
drip coffee                   
oc discount              -0.25
ITEMS 1Q                      
 C A S H         U S D 2 . 2 5

So maybe that should just be adjusted in the INSERT, eg

INSERT INTO transaction_items (trans_id, item, amount)
VALUES (5501, 21, 225);

But I think I'd prefer if it be separate and attach itself to the item somehow. Would it be better to make a whole new table for discounts? That seems a little silly.

Bash script, the beginnings of usefulness 2013-10-27

So I was thinking this week that even without all the item-level details the EJ data that we have so far could actually be useful. Specifically, the one line with the time stamp and transaction ID can tell us when our busiest/slowest times are, at least. So I wrote a shell script:

#!/bin/bash
stamp=`date +"%s"`
thedir='/path/to/reg_data'

find $thedir -name 'EJFILE.SDA' -type f -print0 | xargs -0 cat > $thedir/bigej.txt
iconv -f cp437 -t utf-8 $thedir/bigej.txt > $thedir/bigej_utf-8_$stamp.txt
rm $thedir/bigej.txt
grep -E "^#[0-9]{6} " <$thedir/bigej_utf-8_$stamp.txt | sort | uniq | \
sed 's/.$//' | \
sed 's/^#//' | \
sed 's/ *$//' | \
sed -E 's|^([0-9]{6})[ ]+([0-9]{4})/([0-9]{2})/([0-9]{2})[ ]+(.+)$|\(\1, "\2-\3-\4 \5"\),|' | \
sed -E 's/ ([0-9]:)/ 0\1/' | \
sed 's/^(0*/(/'> $thedir/transactions_$stamp.txt

I've been pretty sloppy about organizing the archived register data, and the script reflects that. I've just been sticking the entire contents of the SD card in dated directories and leaving it up to the uniqueness of the transaction IDs to sort themselves out-- seems to be working OK so far. The register appends new EJ data on to the existing EJ until you wipe it explicitly, so there's a lot of redundant info on there. I don't really care all that much right now, maybe it'll become a problem later?

Anyway. All I had to do at this point was to stick a semicolon on the last line and add

INSERT INTO `thedb`.`xe_transactions` (`id`, `stamp`) VALUES

At the top and we were good to go. So that's pretty awesome. I'm going to try to make an SVG graph of our trends tonight but we'll see how far I get, the Fernet may interfere. I'm also not sure how much actual data I want to expose so that'll take some thinking. It's nothing personal, you understand. In any case I also made a view, it looks like this:

CREATE OR REPLACE VIEW xe_trans_dow AS
SELECT id, stamp,
       DATE_FORMAT(xt.stamp, '%w') AS 'dw_n',
       DATE_FORMAT(xt.stamp, '%W') AS 'dw_t',
       DATE_FORMAT(xt.stamp, '%H') AS 'hour',
       DATE_FORMAT(xt.stamp, '%i') AS 'min'
FROM thedb.xe_transactions AS xt;

At this point I'm wondering what the best way to count/collect the data from each day/hour (or half hour?) combination would be. Is it better to do that kind of thing in the DB? We're using MySQL for the record. I have the beginning of a PHP script that loads all the data into a 2D array but that seems dumb somehow.
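Doing the counting in the database itself seems cleaner than the PHP array idea. Assuming the view above, something along these lines would bucket transactions by day-of-week and hour (and FLOOR(min/30) would get half-hour buckets); the column names follow the view but the rest is a sketch:

```sql
SELECT dw_t AS day, hour, COUNT(*) AS transactions
FROM xe_trans_dow
GROUP BY dw_n, dw_t, hour
ORDER BY dw_n, hour;
```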

Revised bash script 2013-12-29

So I made a few modifications to the script, mostly just to automate all the stuff I was doing by hand. Aren't computers great?

#!/bin/bash

#hot dates
stamp=`date +"%s"`
thisdir=`date +"%Y-%m-%d-%s"`

#set directories
thedir='/the/path/to/the/reg_data'
cpdir=$thedir/$thisdir

#make new directory and copy all contents of sd card
mkdir $cpdir
cp -r /Volumes/NO\ NAME/ $cpdir

#combine all ej data into one big file
#find $thedir -name 'EJFILE.SDA' -type f -print0 | xargs -0 cat > $thedir/bigej.txt

#convert to utf-8
#iconv -f cp437 -t utf-8 $thedir/bigej.txt > $thedir/bigej_utf-8_$stamp.txt

#get rid of non utf-8
#rm $thedir/bigej.txt

#or! both at once???
find $thedir -name 'EJFILE.SDA' -type f -print0 | xargs -0 cat | \
iconv -f cp437 -t utf-8 > $thedir/bigej_utf-8_$stamp.txt

#find all transactions, convince them to be sql or some semblance thereof
grep -E "^#[0-9]{6} " <$thedir/bigej_utf-8_$stamp.txt | sort | uniq | \
sed 's/.$//' | \
sed 's/^#//' | \
sed 's/ *$//' | \
sed -E 's|^([0-9]{6})[ ]+([0-9]{4})/([0-9]{2})/([0-9]{2})[ ]+(.+)$|\(\1, "\2-\3-\4 \5"\),|' | \
sed -E 's/ ([0-9]:)/ 0\1/' | \
sed 's/^(0*/(/'> $thedir/transactions_$stamp.txt

A note about PLUDT.SDA 2014-01-04

Arne writes:

Regarding your A-207 cash register project, I have a small addition: The very first byte of a line in the PLUDT.SDA file determines the PLU/EAN type of the entry. If the byte is 0x00, the PLU/EAN field contains a PLU number, if it is 0x10, the entry contains an EAN number. This is important if you connect a barcode reader.

He also mentioned that it would probably be easier/better to attack this whole project using Python and that he has had a measure of success using a close relative of our register in a grocery store. So that's pretty cool. We probably won't ever be using barcode scanners but maybe someone reading this will?
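Arne's note also suggests a quick way to see whether a PLUDT.SDA has any EAN entries in it (again assuming the 32-byte record size):

```shell
#!/bin/bash
# plu_ean_types: tally the first byte of each 32-byte PLUDT.SDA record.
# Per Arne, 00 = PLU number and 10 = EAN number.
plu_ean_types() {
    xxd -p -c 32 "$1" | cut -c1-2 | sort | uniq -c
}

# usage: plu_ean_types PLUDT.SDA
```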

A few bash script additions 2014-04-19

So it's been a while and I don't have a whole lot to report, sorry. This is a small update to the bash script which dumps the department and PLU data to text files. Maybe at some point I'll be able to use them to update the DB or diff against older versions?

#!/bin/bash

#hot dates
stamp=`date +"%s"`
thisdir=`date +"%Y-%m-%d-%s"`

#set directories
thedir='/path/to/the/reg_data'
cpdir=$thedir/$thisdir

#make new directory and copy all contents of sd card
mkdir $cpdir
cp -r /Volumes/NO\ NAME/ $cpdir

#combine all ej data, convert to utf-8
find $thedir -name 'EJFILE.SDA' -type f -print0 | xargs -0 cat | \
iconv -f cp437 -t utf-8 > $thedir/bigej_utf-8_$stamp.txt

#find all transactions, convince them to be sql or some semblance thereof
grep -E "^#[0-9]{6} " <$thedir/bigej_utf-8_$stamp.txt | sort | uniq | \
sed 's/.$//' | \
sed 's/^#//' | \
sed 's/ *$//' | \
sed -E 's|^([0-9]{6})[ ]+([0-9]{4})/([0-9]{2})/([0-9]{2})[ ]+(.+)$|\(\1, "\2-\3-\4 \5"\),|' | \
sed -E 's/ ([0-9]:)/ 0\1/' | \
sed 's/^(0*/(/'> $thedir/transactions_$stamp.txt

#upload to archive

#clean up

#dump binaries of prog data, massage into sql? diff against old info?
find $cpdir -name 'DEPTDT.SDA' -type f -print0 | xargs -0 cat | \
xxd -c 29 -g 1 > $cpdir/DEPTDT.SDA_$stamp.txt

find $cpdir -name 'PLUDT.SDA' -type f -print0 | xargs -0 cat | \
xxd -c 32 -g 1 > $cpdir/PLUDT.SDA_$stamp.txt

#open directory in finder
open $thedir

The xxd lines generate stuff that looks like this:

0000340: 00 00 00 00 00 00 00 27 15 06 00 00 00 00 00 75 70 72 69 6e 63 65 20 6a 61 6d 6d 79 00 00 00 00  .......'.......uprince jammy....
0000360: 00 00 00 00 00 00 00 28 15 06 00 00 00 00 00 75 62 75 73 74 65 72 00 00 00 00 00 00 00 00 00 00  .......(.......ubuster..........
0000380: 00 00 00 00 00 00 00 29 15 06 00 00 00 00 01 50 64 6f 75 62 6c 65 20 62 75 73 74 65 72 00 00 00  .......).......Pdouble buster...
00003a0: 00 00 00 00 00 00 00 30 15 06 00 00 00 00 02 50 6d 6a 20 62 69 73 63 75 69 74 00 00 00 00 00 00  .......0.......Pmj biscuit......
00003c0: 00 00 00 00 00 00 00 31 15 06 00 00 00 00 02 25 70 62 20 62 61 6c 6c 65 72 00 00 00 00 00 00 00  .......1.......%pb baller.......

Which is helpful I guess?

More bash tweaks-- eliminating duplicates, sorting, etc 2014-05-15

This will probably be weird and gross for some of you, but I honestly couldn't think of any other way to do it. It occurred to me that if I replaced all the single line breaks with a token of some kind and the double line breaks with single line breaks I could get all transactions on to their own lines. At which point it would be much easier to sort and uniq them. I mentioned earlier that my archiving style was/is sloppy and as a result the aggregate EJ was getting pretty big (>60MB!) because of all the duplicates. This will hopefully allow us to keep the filesize slimmer and the file itself more orderly. And maybe actually be able to do something with it at some point.

#!/bin/bash

#hot dates
stamp=`date +"%s"`
thisdir=`date +"%Y-%m-%d-%s"`

#set directories
thedir='/the/path/to/the/reg_data'
cpdir=$thedir/$thisdir

#make new directory and copy all contents of sd card
mkdir $cpdir
cp -r /Volumes/NO\ NAME/ $cpdir

#combine all ej data, convert to utf-8
find $thedir -name 'EJFILE.SDA' -type f -print0 | xargs -0 cat | \
iconv -f cp437 -t utf-8 > $thedir/bigej_utf-8_$stamp.txt

#convert single line breaks into tokens, double line breaks into singles
#(tr is byte-oriented, so a multibyte token like ♥ would get mangled--
#use single-byte control characters as the tokens instead)
h=$(printf '\001')
c=$(printf '\002')
tr '\r\n' "$h$h" < $thedir/bigej_utf-8_$stamp.txt | \
sed -E "s/$h$h +$h$h/$c/g" | tr "$c" '\n' | sort | uniq | \
sed 's/^#//' | \
sed -E 's|^([0-9]{6})[ ]+([0-9]{4})/([0-9]{2})/([0-9]{2})[ ]+(.+)$|\1 \2-\3-\4 \5 |' | \
sed -E 's/ ([0-9]:)/ 0\1/' > \
$thedir/bigej_utf-8_lb_$stamp.txt

#find all transactions, convince them to be sql or some semblance thereof
#grep -E "^#[0-9]{6} " <$thedir/bigej_utf-8_$stamp.txt | sort | uniq | \
#sed 's/.$//' | \
#sed 's/^#//' | \
#sed 's/ *$//' | \
#sed -E 's|^([0-9]{6})[ ]+([0-9]{4})/([0-9]{2})/([0-9]{2})[ ]+(.+)$|\(\1, "\2-\3-\4 \5"\),|' | \
#sed -E 's/ ([0-9]:)/ 0\1/' | \
#sed 's/^(0*/(/'> $thedir/transactions_$stamp.txt

#upload to archive

#clean up

#dump binaries of prog data, massage into sql? diff against old info?
find $cpdir -name 'DEPTDT.SDA' -type f -print0 | xargs -0 cat | \
xxd -c 29 -g 1 > $cpdir/DEPTDT.SDA_$stamp.txt

find $cpdir -name 'PLUDT.SDA' -type f -print0 | xargs -0 cat | \
xxd -c 32 -g 1 > $cpdir/PLUDT.SDA_$stamp.txt

#find blocks that start #nnnnnn nnnn/nn/nn n?n:nn:nn
#and end  C A S H         U S D...
#look for voids eg:
#1@ 12.00              USD12.00
#hair bender                   
#hair bender            V-12.00
#remove voids
#figure out how to deal with oc discount
#eliminate duplicates

#open directory in finder
open $thedir

Offsets, for the record 2015-12-27

Just noting a few byte offsets in case anyone else would find them helpful: