Channel: Raspberry Pi Forums

Off topic discussion • Re: Getting the text out of "Textra" word processor docs circa 1985

Going to need a bit more tinkering then.
Do you even tinkeratall?
Well, ame, I'll stand by my board name. I usually tinker a lot more than I talk. Sometimes too much, literally, and then I get tired and just want some advice on how to get things to work with a little less effort, eh? That's where I benefit from the wide experience and helpful attitudes of our brothers on this forum.
I have some ancient documents produced in "Textra", an early 80s DOS word processor. …
Well, ya nerdsniped me on this one …
...

I don't recommend OCRing dot matrix output. It usually disappoints. The emulation setup should be less than a day. If possible try dosbox-x over plain dosbox, as it's much easier to navigate. It should run on almost all of your machines.

If these files aren't super sensitive, drop one my way (scruss at scruss dot com) and I'll see what I can do.

Update: I can't get Textra 6 to convert files at all. And it's clear that different versions of Textra use completely different formats. Textra 3 can convert to WordStar, which WordTsar (no really!) did a valiant job on. Alternatively, Textra 6 can use a PostScript printer driver that prints to file, and that will convert to PDF nicely.
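If the Textra 3 to WordStar route works, the WordStar output may still need a cleanup pass before it reads as plain text. A minimal sketch in Python, assuming the converted files follow the classic WordStar document convention of setting bit 7 on some characters (word endings, soft carriage returns) and embedding control codes for formatting. That assumption has not been checked against Textra's actual output, and `wordstar_to_text` is a hypothetical helper name:

```python
def wordstar_to_text(data: bytes) -> str:
    """Return plain ASCII text from WordStar-style document bytes.

    Assumption: bit 7 is used only as a formatting marker, so clearing it
    recovers the underlying ASCII character.
    """
    out = []
    for raw in data:
        if raw == 0x1A:          # Ctrl-Z: DOS/CP/M end-of-file padding
            break
        b = raw & 0x7F           # clear the high bit
        # Keep printable ASCII plus tab, LF and CR; drop other control codes
        # (bold, underline and similar toggles).
        if (0x20 <= b < 0x7F) or b in (0x09, 0x0A, 0x0D):
            out.append(chr(b))
    return "".join(out)
```

Dot commands (lines starting with a period, such as page-layout directives) are left in place by this sketch and would need to be filtered separately.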
I appreciate your recommendation of dosbox-x.
I thought I had the images of the diskettes, but I don't any more. Pretty sure it wasn't v6, v3 sounds about right. This is what I see in one file:

Code:

00000000  ae 26 56 30 30 30 36 30  30 30 36 30 30 30 32 34  |.&V0006000600024|
00000010  30 30 36 33 30 30 36 30  30 30 42 34 30 30 31 30  |0063006000B40010|
00000020  30 30 30 30 42 30 30 31  35 31 41 31 46 31 34 32  |0000B00151A1F142|
00000030  39 32 45 32 33 33 38 33  44 33 32 34 37 34 43 34  |92E23383D32474C4|
00000040  31 35 36 35 42 35 0d 0a  30 36 35 36 41 36 30 30  |1565B5..0656A600|
00000050  43 30 30 30 35 34 30 30  31 30 30 30 30 30 42 30  |C0005400100000B0|
00000060  30 31 35 31 41 31 46 31  34 32 39 32 45 32 33 33  |0151A1F14292E233|
00000070  38 33 44 33 32 34 37 34  43 34 31 35 36 35 42 35  |83D32474C41565B5|
00000080  30 36 35 36 41 36 30 30  0d 0a 46 31 30 30 35 34  |0656A600..F10054|
00000090  30 30 31 30 30 30 30 30  42 30 30 31 35 31 41 31  |00100000B00151A1|
000000a0  46 31 34 32 39 32 45 32  33 33 38 33 44 33 32 34  |F14292E23383D324|
000000b0  37 34 43 34 31 35 36 35  42 35 30 36 35 36 41 36  |74C41565B50656A6|
000000c0  30 30 36 30 30 30 42 34  30 30 0d 0a 31 30 30 30  |006000B400..1000|
000000d0  30 30 42 30 30 31 35 31  41 31 46 31 34 32 39 32  |00B00151A1F14292|
000000e0  45 32 33 33 38 33 44 33  32 34 37 34 43 34 31 35  |E23383D32474C415|
000000f0  36 35 42 35 30 36 35 36  41 36 30 30 36 30 30 30  |65B50656A6006000|
00000100  42 34 30 30 31 30 30 30  30 30 42 30 0d 0a 30 31  |B400100000B0..01|
00000110  35 31 41 31 46 31 34 32  39 32 45 32 33 33 38 33  |51A1F14292E23383|
00000120  44 33 32 34 37 34 43 34  31 35 36 35 42 35 30 36  |D32474C41565B506|
00000130  35 36 41 36 30 30 36 30  30 30 42 34 30 30 31 30  |56A6006000B40010|
00000140  30 30 30 30 42 30 30 31  35 31 41 31 46 31 0d 0a  |0000B00151A1F1..|
00000150  34 32 39 32 45 32 33 33  38 33 44 33 32 34 37 34  |4292E23383D32474|
00000160  43 34 31 35 36 35 42 35  30 36 35 36 41 36 30 30  |C41565B50656A600|
00000170  36 30 30 30 42 34 30 30  31 30 30 30 30 30 42 30  |6000B400100000B0|
00000180  30 31 35 31 41 31 46 31  34 32 39 32 45 32 33 33  |0151A1F14292E233|
00000190  0d 0a 38 33 44 33 32 34  37 34 43 34 31 35 36 35  |..83D32474C41565|
000001a0  42 35 30 36 35 36 41 36  30 30 36 30 30 30 42 34  |B50656A6006000B4|
000001b0  30 30 31 30 30 30 30 30  42 30 30 31 35 31 41 31  |00100000B00151A1|
000001c0  46 31 34 32 39 32 45 32  33 33 38 33 44 33 32 34  |F14292E23383D324|
000001d0  37 34 0d 0a 43 34 31 35  36 35 42 35 30 36 35 36  |74..C41565B50656|
000001e0  41 36 30 30 31 30 30 30  31 30 30 30 32 30 30 30  |A600100010002000|
000001f0  33 30 30 30 33 30 30 30  31 30 30 30 32 30 30 30  |3000300010002000|
00000200
If this isn't textra, I'll have to dope slap myself to save you guys the trouble.
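For anyone who wants to poke at these bytes without having the original file, a dump in this offset / hex / ASCII layout can be turned back into binary. A rough sketch in Python: `parse_hexdump` and `SAMPLE` are names made up for illustration, and `SAMPLE` holds just the first two rows of the dump above.

```python
import re

# One row: 8-digit hex offset, hex byte pairs, then the |ASCII| column.
ROW_RE = re.compile(r"^([0-9a-f]{8})\s+((?:[0-9a-f]{2}\s+)+)\|", re.IGNORECASE)

def parse_hexdump(text: str) -> bytes:
    """Rebuild the raw bytes from offset/hex/ASCII dump rows."""
    out = bytearray()
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue  # e.g. the final offset-only row
        offset = int(m.group(1), 16)
        if offset != len(out):
            # Also triggers if the dump used '*' to squeeze repeated rows.
            raise ValueError(f"gap or overlap at offset {offset:#x}")
        out += bytes.fromhex(m.group(2))  # fromhex ignores the spaces
    return bytes(out)

SAMPLE = """\
00000000  ae 26 56 30 30 30 36 30  30 30 36 30 30 30 32 34  |.&V0006000600024|
00000010  30 30 36 33 30 30 36 30  30 30 42 34 30 30 31 30  |0063006000B40010|
"""

data = parse_hexdump(SAMPLE)
print(data[:3].hex(), data[3:].decode("ascii"))
```

Judging only by the dump itself, the first 512 bytes are three leading bytes (`ae 26 56`) followed by ASCII hex digits broken up by CR/LF pairs. What those digits encode is not established here.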
If you can get your Textra files onto a Pi, what happens if you run the "strings" command on them?

Code:

$ strings some_textra_file
If that displays anything that looks like text, then you can output it to a file:

Code:

$ strings some_textra_file > some_textra_file.txt
Which you can now open in your favourite text editor and massage it into shape.

That last part is a bit labour intensive, but depending on how many files you have it may be quicker than messing with VMs, DOSBox, reverse engineering the format etc.
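If there are a lot of files, the same idea can be scripted. A minimal sketch in Python that roughly mimics what "strings" does by default (runs of at least four printable ASCII characters) and writes one .txt per input file. `printable_runs` and `dump_strings` are hypothetical helper names, and the output still needs the same manual massaging:

```python
import re
from pathlib import Path

def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII (plus tab) at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def dump_strings(src_dir: str, glob: str = "*", min_len: int = 4) -> list:
    """Write <name>.txt next to each matching file; return the paths written."""
    written = []
    for path in sorted(Path(src_dir).glob(glob)):
        if not path.is_file() or path.suffix == ".txt":
            continue  # skip directories and earlier output
        out = path.with_name(path.name + ".txt")
        runs = printable_runs(path.read_bytes(), min_len)
        out.write_text("\n".join(runs) + "\n", encoding="ascii")
        written.append(out)
    return written
```

Calling `dump_strings("textra_docs")` would process every file in a directory of that (made-up) name.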
Yes, strings would get me the text, but there would be a lot of checking and massaging after that. That's perhaps worth it for a small document, or a small number of documents, but I'm looking for something with at least the potential of being more automatic.

Statistics: Posted by tinker2much — Mon Jan 08, 2024 2:34 am


