From tuhs at tuhs.org Sat Jun 3 09:04:16 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Fri, 02 Jun 2023 23:04:16 +0000
Subject: [TUHS] CB-UNIX dsw(1l) Page from PDP-7?
Message-ID:

While performing my CB-UNIX 2.3 manual separation, among the many curious things I came across was this manual page:

https://www.tuhs.org/Archive/Distributions/USDL/CB_Unix/man/man1/dsw.1l.pdf

The dsw(I) pages I've seen in the various UNIX manuals are all for the interactive delete utility, but make brief mention of the history of the command being amusing. I've seen some communication on the matter over the years here, but had never come across a manual page for the former version of dsw.

In the linked page up there is the actual "delete from switches" version of dsw. What I find particularly interesting is that the footer indicates this was printed 8/11/81, but likewise indicates the command is "PDP-7 local".

This raises a couple of questions:

- Did Columbus ever touch PDP-7 UNIX?
- Did dsw(I) as "delete from switches" ever make it to PDP-11 UNIX? Even the V1 manual lists the "delete interactively" utility, not this.
- If neither are true, that begs the question of where this page came from, if there was ever a formalized PDP-7 manual that it would've descended from or not, etc.

Finally, this page plainly spells out the history of the command in the bugs section:

"This command was written in 2 minutes to delete a particular file that managed to get an 0200 bit in its name. It should work by printing the name of each file in a specified directory and requestion a 'y' or 'n' answer. Better, it should be an option of rm(1). The name is mnemonic, but likely to cause trouble in the future."

So the first bug is eventually mitigated by transforming this into the more familiar dsw. I can't say what the latter means, whether it's a concern of "dsw" colliding with some reserved word eventually or is more poking fun at the other folk etymology of "delete s__t work".

In any case, I hadn't seen the etymology explained to this degree in the mailing list references I found while searching around, so figured I'd share this analysis.

- Matt G.

P.S. There is mention here that Dennis Ritchie shared the original dsw manpage at some point https://www.tuhs.org/pipermail/tuhs/1999-November/001203.html however the link in question appears to be dead. In any case, the source for the PDP-7 version is in that email if anyone wants to look at it, although it looks to be the same as what is in the archive.

From marc.donner at gmail.com Sat Jun 3 22:59:35 2023
From: marc.donner at gmail.com (Marc Donner)
Date: Sat, 3 Jun 2023 08:59:35 -0400
Subject: [TUHS] CB-UNIX dsw(1l) Page from PDP-7?
In-Reply-To:
References:
Message-ID:

Wow. I’m impressed … that pdf is clearly of an nth generation photocopy. What contrast ratio?

More seriously, this is a delightful proof point that some cruft is really cruft.

Your document archaeology work is entertaining and instructive. Thank you!

Best,

Marc
--
=====
nygeek.net
mindthegapdialogs.com/home

From ron at ronnatalie.com Sun Jun 4 01:19:29 2023
From: ron at ronnatalie.com (Ron Natalie)
Date: Sat, 03 Jun 2023 15:19:29 +0000
Subject: [TUHS] CB-UNIX dsw(1l) Page from PDP-7?
In-Reply-To:
References:
Message-ID:

Might have been a ditto or mimeograph at some point. We had such section 1 manuals at JHU when I was a student there in 1977.

From random832 at fastmail.com Tue Jun 6 04:34:43 2023
From: random832 at fastmail.com (Random832)
Date: Mon, 05 Jun 2023 14:34:43 -0400
Subject: [TUHS] CB-UNIX dsw(1l) Page from PDP-7?
In-Reply-To:
References:
Message-ID: <7a4700f1-7465-4549-985c-83ad7dd8b00e@app.fastmail.com>

On Fri, Jun 2, 2023, at 19:04, segaloco via TUHS wrote:
> Finally, this page plainly spells out the history of the command in the
> bugs section:
>
> "This command was written in 2 minutes to delete a particular file that
> managed to get an 0200 bit in its name. It should work by printing the
> name of each file in a specified directory and requestion a 'y' or 'n'
> answer. Better, it should be an option of rm(1). The name is
> mnemonic, but likely to cause trouble in the future."

One thing this makes me wonder is why the solution chosen wasn't to make a command that could *rename* the offending directory entry [which, then, with problematic characters removed, could be examined before deleting normally].

From norman at oclsc.org Tue Jun 6 05:30:23 2023
From: norman at oclsc.org (Norman Wilson)
Date: Mon, 5 Jun 2023 15:30:23 -0400 (EDT)
Subject: [TUHS] CB-UNIX dsw(1l) Page from PDP-7?
Message-ID:

Rather an aside, but the alt.sysadmin.recovery message referenced in https://www.tuhs.org/pipermail/tuhs/1999-November/001203.html chimes interesting chords for me. If you care only about technical stuff, you should skip to your next e-mail message now.

On one hand, the doubly-embedded net.unix-wizards message from Dennis, dated 1984-12-08, containing the original dsw.s: that was posted a few months after I joined Bell Labs, and may even have been partly my fault. 1984 was the nominal 15th anniversary of UNIX; as a member of the steering committee of the recently-formed UNIX* Special Interest Group in US DECUS, I convinced Dennis to attend the Fall 1984 Symposium, in Anaheim in early December, as part of a celebration. As another part, I made copies of the V1-V7 manuals in the UNIX Room and had them shipped out so people could leaf through the history. I think Dennis dug out the PDP-7 source-code books as a contribution to that effort; I am all but certain we brought copies of those too.

Of course I had the copies returned not to the Labs but to my home address. I still have them, now on a shelf in my home office. A good friend offered to take care of shipping them back to New Jersey, of course making her own copies in return.

I also recall that the conference hotel happened to give me room 127. I offered to swap with Dennis, since he deserved that number more than I did (the extra digit had already been prepended before I joined) but he cheerfully declined.

On the other hand--not as historic except to me--the author of the singly-embedded 1999-11-23 alt.sysadmin.recovery message is now (and has for some years been) a co-worker and a good friend.

So this single 1999 TUHS posting touches points near both the beginning and the end (so far) of my career, and two different groups of smart people who are fun to work with.

Norman Wilson
Toronto ON

From tuhs at tuhs.org Tue Jun 6 13:04:53 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Tue, 06 Jun 2023 03:04:53 +0000
Subject: [TUHS] Some AT&T Promotional Material/Pricing from Early 80s
Message-ID: <71XxhojD5-EjPpdoE4lnOSoYZ1kQmaS5_X4zJWjJoByoTmHvQk3kitJU1uRCTp-AnFzNSR1l-2JvuKfPxANcmBgUf_KAH2YQacVWicclLq4=@protonmail.com>

Hello, I've just today received another box from the person I got that set of UNIX manual binders from and hoo boy there's some cool stuff in here. It'll probably be a bit before I scan it all, but among the many bits is a folder bearing "Software by the Bell System" on the front cover with a photo of some tape reels lying around. The back is a simple black 70's Bell system logo.

Flipping to the interior, the left panel of the folder bears facsimile AT&T letterhead with a "letter" from Otis L. Wilson, Technology Licensing Manager, denoting what promotional materials are enclosed. Among the various terms of the licenses mentioned is:

'all software comes "as is" -- with no maintenance agreements or technical support'

Between this and the Bell logos all over this stuff, I presume it is prior to 1982.

As for the contents themselves, there are pages for V6, V7, Mini-UNIX, PWB, 32V, and System III, the last of which is a photocopy whereas all the others are on some nice glossy cardstock, so I presume this was hot out the door on the heels of System III as a commercial release. Aside from pages describing each of these UNIX versions, there is a separate page describing "The Phototypesetter Package", in other words, pre-DWB distribution of TROFF and friends.

Aside from the UNIX stuff, there are also various utilities amongst IBM 360/370 and Honeywell 600/6000 systems and some various scientific and mathematical systems.

Also included is "Summary of UNIX System III" which looks to be a bit of an amalgamation of info from some of the "Documents for UNIX 3.0" set distributed with System III. Unfortunately, being for external release, the document is very much of the mindset of "here's what changes from V7/32V to System III" rather than that sweet sweet "here's what changes from PWB 2.0 to 3.0" that I hope to find (or create) sometime.

Anywho, finally amongst the promo material was an (undated) letter from M.B. Wicker (Technology Licensing, AT&T) to an unlisted recipient, obviously just a copy they sent to everyone, essentially communicating the terms of UNIX System III in more detail.

Between all of these materials, the following are UNIX-related prices I could find:

UNIX Sixth Edition
- Initial CPU - $20,000
- Additional CPUs - $6,700
- UNIX Programmer's Manual - $30
- Documents for Use with UNIX - $30
- System III upgrade - $26,000
- System III add CPU - $10,300

PWB/UNIX
- Initial CPU - $30,000
- Additional CPUs - $10,000
- PWB/UNIX User's Manual - $40
- Documents for PWB/UNIX - $40
- System III upgrade - $16,000
- System III add CPU - $7,000

Mini-UNIX
- Initial CPU - $12,000
- Additional CPUs - $4,000
- UNIX Programmer's Manual - $30
- Documents for Use with Mini-UNIX - $30

UNIX/V7
- Initial CPU - $28,000
- Additional CPUs - $9,400
- UNIX Programmer's Manual, Vol. 1 - $40
- UNIX Programmer's Manual, Vols. 2A and 2B - $60
- System III upgrade - $18,000
- System III add CPU - $7,600

UNIX/32V
- Initial CPU - $40,000
- Additional CPUs - $15,000
- UNIX/32V Programmer's Manual - $40
- UNIX/32V Programmer's Manual, Vols. 2A and 2B - $60
- System III upgrade - $6,000
- System III add CPU - $2,000

UNIX System III
- Initial CPU - $43,000
- Additional CPUs - $16,000
- UNIX User's Manual - $40
- Programmer's Manual for UNIX System III, Volume 2A and 2B - $40 each
- The separate page detailing System III further notes a non-refundable payment of $25,000 to sublicense object code

Phototypesetter (Version 7)
- Initial CPU - $3,300
- Additional CPUs - $1,100
- Documents for Use with Phototypesetter - Version Seven - $20

Additionally there are options to upgrade a V6&V7 supporting license to System III for $14,000 and add additional CPUs to those terms for $6,300. The same for a group of V7 and PWB for $4,000 and $3,000 for first CPU and additional CPUs respectively.

Of note is that all documents listed above could be purchased from the Computing Information Service in Murray Hill *except* those issued for UNIX System III, which instead were to be ordered from the Western Electric Patent Licensing Organization. This reflects the shift to WECo distribution from Bell Labs themselves, as would continue to be the case for 3B20S shipments of 4.1 and the eventual 5.0 and System V releases.

In addition to the promotional materials are also "Specimen Copy" blanks of the various licenses involved in at least System III, perhaps other versions (there are blanks where LICENSED SOFTWARE is supposed to be written/typed in).

Finally, in the same folder is also a nice stack of UNIX summary documents spanning different versions. There are summaries for PWB, Mini-UNIX, V7, and 32V. Additionally, there is a document "Proposal to Provide VAX UNIX system support at Berkeley" by Bob Fabry. A quick internet search didn't turn up a PDF of this, so I have to wonder if it's preserved somewhere. If not, it will be. The other document here may prove even more interesting though: "The UNIX Time-Sharing System for UNIVAC 1100 Series Systems - 7th Edition Summary", dated October 19th, 1981.
Can't say I've seen this anywhere, just mention of the UNIVAC version in the BSTJ article on UNIX porting experiences. A quick perusal yields a document very similar to the V7 and 32V summaries, but with UNIVAC-isms pointed out.

Anywho, there's more material in this box than just this stuff, but this floated to the top as particularly significant. Among other contents are a "UNIX System and 'C' Language" "Training from the Source" foldout from WECo's Corporate Education group, listing 16 courses available at various training centers. There is a copy (nicely produced) of the 1984 draft /usr/group standard along with a stapled, standard printer document titled "Reviewer's Guide to the PROPOSED /usr/group Standard" dated March 14, 1984. I must say, the publication quality for this being a proposed standard is quite nice, I'd expect a draft to be lucky to have staples if not being just paper-clipped together, but it has a nice printed cover with a logo and all.

There is one other thing but I'm making a separate thread for that one, might warrant quite different feedback than this stuff.

- Matt G.

From tuhs at tuhs.org Tue Jun 6 13:16:17 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Tue, 06 Jun 2023 03:16:17 +0000
Subject: [TUHS] DOD KSOS Secure UNIX Operating System Manual and Final Report
Message-ID:

As promised in the other email, I had one other tidbit worth sharing some detail on but that is very different from WECo promo and informational material. What I've got here are two documents pertaining to the "Department of Defense Kernelized Secure Operating System" project as undertaken by Ford's Western Development Laboratories Division.

The documents in question are https://apps.dtic.mil/sti/pdfs/ADA111577.pdf and https://apps.dtic.mil/sti/pdfs/ADA111566.pdf and represent the User's Manuals and Final Report respectively on this KSOS system.
These appear to be from the same microfiche as the documents linked based on splotches on the last page's date frame, although the copies I have here have the full frame, the PDFs linked seem to have the last panel cropped to a small square in the middle. Not super significant, but sometimes it's the little details.

Anywho, unfortunately I don't have much to report, I got a bit excited while looking for these at first because I was having a hard time turning up PDFs, thought I had stumbled upon something unseen for some time, but in the gulf between last email and this I found them. Silver lining is one less set of documents to scan, but nothing to really expose that isn't already a click away.

- Matt G.

From tom.perrine+tuhs at gmail.com Tue Jun 6 13:34:09 2023
From: tom.perrine+tuhs at gmail.com (Tom Perrine)
Date: Mon, 5 Jun 2023 20:34:09 -0700
Subject: [TUHS] DOD KSOS Secure UNIX Operating System Manual and Final Report
In-Reply-To:
References:
Message-ID:

I will have to take a look around my offline archives and CDs. I had the complete source of KSOS as it existed ca 1988 after certain changes were made at Logicon.

I haven't looked at that data in years - it is possible that it may even have the original Ford KDNs (Kernel Design Notes). I'm *pretty* sure I know where it is.

If there's any interest I can look for it later this week.

Before and during early covid I was using SIMH to build a PWB system to serve as the dev platform for KSOS. I think I had everything EXCEPT the Modula compiler. For the life of me I can't remember which compiler it was. It was GFE as part of the project; it would have been the same compiler used at Ford.

--tep

From tuhs at tuhs.org Tue Jun 6 13:46:20 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Tue, 06 Jun 2023 03:46:20 +0000
Subject: [TUHS] DOD KSOS Secure UNIX Operating System Manual and Final Report
In-Reply-To:
References:
Message-ID:

If you do find that information Tom, one thing I'd be curious about is if it contains the seismic discrimination components:

https://apps.dtic.mil/sti/pdfs/ADA121241.pdf

This same cache of documents contained two sets of the manual pages for what I presume are the components being discussed in that paper. The libraries are listed as "seismic", "datacopy", "sdp", "graphics", and "mod1". That said, I can't authoritatively tie them together but I think that's all talking about the same thing.
Unfortunately none of this stuff was in close proximity in the binders and stacks of paper, so I can't say whether they're actually related or not, but between this all being together and both KSOS and the seismic package info being there in the same .mil repository, there's a chance.

No rush though, just something to keep in mind if you happen upon that info.

- Matt G.

From jsg at jsg.id.au Tue Jun 6 15:26:22 2023
From: jsg at jsg.id.au (Jonathan Gray)
Date: Tue, 6 Jun 2023 15:26:22 +1000
Subject: [TUHS] Some AT&T Promotional Material/Pricing from Early 80s
In-Reply-To: <71XxhojD5-EjPpdoE4lnOSoYZ1kQmaS5_X4zJWjJoByoTmHvQk3kitJU1uRCTp-AnFzNSR1l-2JvuKfPxANcmBgUf_KAH2YQacVWicclLq4=@protonmail.com>
References: <71XxhojD5-EjPpdoE4lnOSoYZ1kQmaS5_X4zJWjJoByoTmHvQk3kitJU1uRCTp-AnFzNSR1l-2JvuKfPxANcmBgUf_KAH2YQacVWicclLq4=@protonmail.com>
Message-ID:

On Tue, Jun 06, 2023 at 03:04:53AM +0000, segaloco via TUHS wrote:
> Finally, in the same folder is also a nice stack of UNIX summary
> documents spanning different versions. There are summaries for
> PWB, Mini-UNIX, V7, and 32V. Additionally, there is a document
> "Proposal to Provide VAX UNIX system support at Berkeley" by Bob
> Fabry. A quick internet search didn't turn up a PDF of this, so I
> have to wonder if it's preserved somewhere. If not, it will be.

Is that the proposal mentioned in
https://www.oreilly.com/openbook/opensources/book/kirkmck.html

"In the fall of 1979, Bob Fabry responded to DARPA's interest in moving towards Unix by writing a proposal suggesting that Berkeley develop an enhanced version of 3BSD for the use of the DARPA community."

predating

"The New ARPA VAX UNIX Project"
https://archive.org/details/login_january-1981/page/12/mode/2up

CSRG TR/4: Proposals for enhancement of UNIX on the VAX
https://archive.org/details/csrgtr4/mode/2up

From tuhs at tuhs.org Wed Jun 7 17:17:14 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Wed, 07 Jun 2023 07:17:14 +0000
Subject: [TUHS] Pixel 100/AP UNIX Computer
Message-ID:

After talking with the folks I bought the recent documents from, they let me know they are also selling a piece of hardware:

https://www.ebay.com/itm/125714380860

After the link is an auction for an Instrumentation Laboratory Pixel 100/AP. A small booklet included with the many documents I received indicates as of 1982 the Pixel 100/AP ran a System III derivative. The booklet goes on to present a summary of user commands and options. Despite the System III basis, included among these are the C shell and ex/vi.

I have no room for hardware, or honestly at that price point it'd be worth the preservation effort. Hopefully it finds a good home, it includes an almost complete documentation set save for the small booklet I've got (which could be separate promo material for all I know).

In any case, there were a few letters amongst the documents suggesting the original owner was involved in the production of this system, particularly in the area of OS details. If I find any noteworthy information I'll pass it along.

- Matt G.

P.S. If anyone knows of a preservation effort accepting new machines I can pass this along.

From f4grx at f4grx.net Wed Jun 7 20:14:14 2023
From: f4grx at f4grx.net (Sebastien F4GRX)
Date: Wed, 7 Jun 2023 12:14:14 +0200
Subject: [TUHS] Software written in B
Message-ID: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>

Hello everyone,

this is my first post on this list.

After looking at the archives for this mailing list, I have seen that the B language has been discussed several times already.

After viewing Ken Thompson's interview by Brian Kernighan at VCF East 2019, I became interested in the B language, as it seemed full-featured for system programming, close to C, and simple enough to write a parser for it without a code generation tool.

So for fun and self-education, I am now writing a (or yet another) B compiler, in C, after reading Jack Crenshaw's "Let's build a compiler" documentation ( https://compilers.iecc.com/crenshaw/ )

Here it is: https://git.sr.ht/~f4grx/bpars

It is now starting to generate code for the 68hc11 8-bit platform. It can also generate C code.

I have written some test programs, found some B examples, but I thought it would be great to use my compiler with actual B software.

Of course, B was a "transition" language that did not have continued use once it evolved into C, so if any software remains, it will be quite hard to find.

And here is my question: are any of you aware of original B source code archives, or in touch with people who would know?

In particular, I read in this document written by Dennis Ritchie:
https://www.bell-labs.com/usr/dmr/www/chist.html

> After the TMG version of B was working, Thompson rewrote B in itself (a bootstrapping step).

I have also read that the YACC tool was initially written in B.

There might be other historical B sources that I am not aware of.

Do you know if any of this code has survived to this day? Where could I find more information about this?

Thank you very much,

Sebastien Lorquet (F4GRX)

From lars at nocrew.org Wed Jun 7 20:38:12 2023
From: lars at nocrew.org (Lars Brinkhoff)
Date: Wed, 07 Jun 2023 10:38:12 +0000
Subject: [TUHS] Software written in B
In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> (Sebastien F4GRX's message of "Wed, 7 Jun 2023 12:14:14 +0200")
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID: <7wy1kv5zhn.fsf@junk.nocrew.org>

Sebastien F4GRX wrote:
> And here is my question, is any of you aware of original B source code
> archives? or are in touch with people that would know?

Not for Unix, but.

Steven Johnson went on a sabbatical from Bell Labs to Waterloo University and took B with him. A group of people there formed a company which still offers B for GCOS.
https://www.thinkage.ca/gcos/expl/b/manu/manu.html

The original AberMUD was written by Alan Cox (of Linux fame) et al in B. A paper listing exists but has yet to be scanned. I collected some scraps here:
https://github.com/larsbrinkhoff/abermud/tree/master/abermud1

Some other people also writing B compilers:
https://github.com/aap/b
https://github.com/DavidSkrundz/B

From aap at papnet.eu Thu Jun 8 01:05:27 2023
From: aap at papnet.eu (Angelo Papenhoff)
Date: Wed, 7 Jun 2023 17:05:27 +0200
Subject: [TUHS] Software written in B
In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID:

Sebastien,

I'm not aware of any old UNIX B code beyond the examples from the documentation and one or two short programs for the PDP-7 (this is older B).

As Lars already mentioned, some people have written their own B compilers, even in B. My compiler (https://github.com/aap/b) generates the same threaded code that ran on the PDP-11 and I implemented it on a few platforms, (pdp-11, amd64, mips32, riscv64), notably it runs on UNIX v6.
Robert Swierczek has written a B compiler that is compatible with the PDP-7 runtime: https://github.com/DoctorWkt/pdp7-unix/blob/master/src/other/b.b

With B you pretty much have to write your own code, unfortunately. It would be great if some bigger programs (like the compiler and yacc) were found.

best,
Angelo

From clemc at ccc.com Thu Jun 8 01:57:30 2023
From: clemc at ccc.com (Clem Cole)
Date: Wed, 7 Jun 2023 11:57:30 -0400
Subject: [TUHS] Software written in B
In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID:

I recall a couple of editors and some tools were kicking around written in B. I would check the GCOS archives, as I believe that for a long time, B was a popular systems programming language for that OS target.

B might have been moved to Multics, but I have no memory of seeing it. IIRC, most system programming there was done in its powerful PL/1 dialect, or in Fortran, often with a preprocessor like MORTRAN or RatFor, which I did see.

Interestingly, I also have no memory of a B implementation for the PDP-10, which, like the GE/Honeywell systems, was 36-bit and word addressed. I used BLISS and SAIL on those, if not the assembler.

FWIW: Besides C, B also begat two other languages, Eh and Zed, both at Waterloo. Eh, I believe, was what the original Thoth system was written in, although it might have had some utilities in B; you would have to ask someone like Mike Malcolm.

Since many/most of the 1970s minis' and later micros' ISAs were byte addressed, the word-oriented nature of B (and the fact that the source to Ritchie's C compiler came with UNIX) is probably what caused it to have a more limited life.

From lars at nocrew.org Thu Jun 8 02:21:59 2023
From: lars at nocrew.org (Lars Brinkhoff)
Date: Wed, 07 Jun 2023 16:21:59 +0000
Subject: [TUHS] Software written in B
In-Reply-To: (Clem Cole's message of "Wed, 7 Jun 2023 11:57:30 -0400")
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID: <7wttvj5jko.fsf@junk.nocrew.org>

Clem Cole writes:
> Interestingly, I also have no memory of a B implementation for the PDP-10,
> which like GE/Honeywell systems, was 36-bit, word addressed.

PDP-10 had BCPL (two!) which was somewhat popular, so maybe it didn't need a pared-down version. Same goes for TX-2 if you want to talk 36-bit (who doesn't).

> I used BLISS and SAIL on those, if not the assembler.

SAIL begat Mainsail, which in theory could still be around today... for Unix, he said on-topicly!

From bakul at iitbombay.org Thu Jun 8 03:26:10 2023
From: bakul at iitbombay.org (Bakul Shah)
Date: Wed, 7 Jun 2023 10:26:10 -0700
Subject: [TUHS] Software written in B
In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID: <798AFBDA-375B-4038-9587-1BA38309E8E7@iitbombay.org>

Then there is Arthur Whitney's B language. This is a *completely* different beast from Ken Thompson's B.

https://web.archive.org/web/20160505183405/http://kparc.com/b/

Here's binary search in it:

b[Ii]{h:#x;l:0;while(h>l)$[y>x[i:/l+h];l:i+1;h:i];l}

for comparison here it is in C:

I b(I*x,I y){I h=x[-1],i,l=0;while(h>l)if(y>x[i=l+h>>1])l=i+1;else h=i;R l;}

[B is modeled after his K language, hence the extreme terseness]

Perhaps not most people's cup of tea but certainly interesting. Too bad no one else has dared to pick it up!
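For readers who find either one-liner hard to follow, here is a minimal Python sketch of what the C version above appears to compute: a lower-bound binary search over a sorted vector. The name `lower_bound` is made up for this illustration, and the sketch uses `len()` where the C code reads the length from `x[-1]` (presumably a length word stored just before the data, which is an assumption here).

```python
def lower_bound(x, y):
    """Return the first index i with x[i] >= y in sorted list x (len(x) if none)."""
    lo, hi = 0, len(x)          # the C version takes hi from x[-1] instead
    while hi > lo:
        mid = (lo + hi) >> 1    # like i=l+h>>1 in C, where '+' binds tighter than '>>'
        if y > x[mid]:
            lo = mid + 1        # answer lies strictly to the right of mid
        else:
            hi = mid            # x[mid] >= y, so mid is still a candidate
    return lo


if __name__ == "__main__":
    data = [1, 3, 3, 7, 9]
    print(lower_bound(data, 3))
```

The result is an insertion point rather than a found/not-found flag, so the caller has to check whether the element at that index actually equals the key.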
:)/2

> On Jun 7, 2023, at 3:14 AM, Sebastien F4GRX wrote:
>
> So for fun and self-education, I am now writing a (or yet another) B compiler, in C, after reading Jack Crenshaw's "Let's build a compiler" documentation ( https://compilers.iecc.com/crenshaw/ )
>
> Here it is: https://git.sr.ht/~f4grx/bpars

From phil at ultimate.com Thu Jun 8 04:16:55 2023
From: phil at ultimate.com (Phil Budne)
Date: Wed, 07 Jun 2023 14:16:55 -0400
Subject: [TUHS] Software written in B
In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID: <202306071816.357IGtA0046002@ultimate.com>

Here's what I know about surviving bits of early UNIX B implementations:

Two surviving B programs from PDP-7 UNIX listings:

https://github.com/DoctorWkt/pdp7-unix/blob/master/scans/ind.b
https://github.com/DoctorWkt/pdp7-unix/blob/master/scans/lcase.b

Not a B program, but for "fun" I tried my hand at recreating the TMG compiler compiler for the PDP-7 (we only have the original library routines, not the TMG (.t) source for the compiler) at:

https://github.com/philbudne/pdp7-unix/tree/tmg/src/other/pbtmg

and tried writing a B compiler in TMG (b.t in the above directory) based on Robert Swierczek's decoding of the B interpreter/runtime.

Both the PDP-7 and (initial?) PDP-11 B compilers generated interpreted code.

I'm not aware if the PDP-11 B compiler source has ever been unearthed, but an ar(chive) file of the library and interpreter was found, and Robert S posted disassemblies here:

https://github.com/rswier/pdp11-B/

With this explanation of where the files were discovered:

https://github.com/rswier/pdp11-B/tree/master/fs

As well as

http://squoze.net/B/bilib/

Which I added to Robert's tree in a private branch that I guess I never opened a Pull Request for:

https://github.com/philbudne/pdp11-B/tree/pb/source/bilib

And I (think) I had to make up brt1.s and brt2.s files:

https://github.com/philbudne/pdp11-B/blob/pb/source/brt/brt1.s
https://github.com/philbudne/pdp11-B/blob/pb/source/brt/brt2.s

based on the usage description in Ken's manual: https://www.bell-labs.com/usr/dmr/www/kbman.html :

    ld object /etc/brt1 -lb /etc/bilib /etc/brt2

I tried to hack Robert's PDP-7 B compiler to work with the above PDP-11 runtime:

https://github.com/philbudne/pdp11-B/blob/pb/source/b711/

But I don't remember what state I left it in.

Ken's manual indicates that the original PDP-11 B compiler had two phases: "bc" which generated intermediate code, and "ba" which turned that into a .s file.

A "fun fact" about all the above compilers (TMG and B) for both the PDP-7 and the PDP-11 is that the compilers (eventually) output assembly language source that was assembled (and for the PDP-11 loaded) with an interpreter library and runtime.

I seem to recall that for the PDP-7, the interpreted code (for both TMG and B) looks much like PDP-7 instructions (same number of high order "opcode" bits, some use of the indirect bit, and an address field), while the code generated by the PDP-11 B compiler (at least) is more like threaded code, with an interpreter program counter in r3, and "jmp *(r3)+" to dispatch to the next instruction.
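To make the threaded-code idea above a little more concrete, here is a small Python sketch. It is not the B runtime: the "compiled program" is just a list of references to routines (with operands stored inline), and a dispatcher fetches the next entry after each routine finishes, which is roughly the job "jmp *(r3)+" does on the PDP-11. The class `VM` and the operations `push`, `add`, `mul`, and `halt` are invented for illustration.

```python
class VM:
    """Toy threaded-code interpreter; `ip` plays the role of r3."""

    def __init__(self, thread):
        self.thread = thread   # list of routine references and inline operands
        self.ip = 0            # interpreter program counter
        self.stack = []
        self.running = True

    def fetch(self):
        """Return the next cell of the thread and advance ip, like (r3)+."""
        cell = self.thread[self.ip]
        self.ip += 1
        return cell

    def run(self):
        # Dispatch loop: stands in for every routine ending in "jmp *(r3)+"
        while self.running:
            routine = self.fetch()
            routine(self)
        return self.stack[-1] if self.stack else None


def push(vm):
    vm.stack.append(vm.fetch())   # operand sits inline, right after the routine reference


def add(vm):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a + b)


def mul(vm):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a * b)


def halt(vm):
    vm.running = False


if __name__ == "__main__":
    # (2 + 3) * 4 expressed as a thread of routine references
    program = [push, 2, push, 3, add, push, 4, mul, halt]
    print(VM(program).run())
```

On real hardware there is no central loop; each routine jumps straight to the next one through the thread pointer, which is what makes the scheme cheap.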
From andrew at humeweb.com Thu Jun 8 09:49:49 2023
From: andrew at humeweb.com (Andrew Hume)
Date: Wed, 7 Jun 2023 16:49:49 -0700
Subject: [TUHS] Software written in B
In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID:

i was in 1127 in 1981 (for an internship) and then from 1983 onwards. the only B code i saw or knew of was the driver code that ran in the Linotron typesetter. it ran on an 8in floppy and my recollection is that the thing that mattered most was its very small run-time support. i’m pretty sure it was Ken’s code but i never knew any other details.

From tuhs at tuhs.org Thu Jun 8 12:10:08 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Thu, 08 Jun 2023 02:10:08 +0000
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID:

> Of course, B was a "transition" language, that did not have a continued use as soon as it evolved into C. so if any software remains, it will be quite hard to find.

There is a bit of binary code from that transition period that may hold some answers. In my disassembly of the s2-bits binaries (V2-ish), I've come across the following that bear a particular signature I can't identify otherwise:

echo, exit, glob, goto, if, mail, skip, stty, su

This "signature" I refer to being a few properties of the a.out files and initial flow of the entry compared with other binaries of known source code origin. First, these are all magic number 405(8) binaries, so V1 era a.out. Second, in each case, the initial branch is to a jump vector which then performs a r5-relative subroutine call followed by a halt in the case of fallthrough. In other words:

        br      _start          / 405(8)
        ...
_start:
        jmp     innerstart      / some faraway place
        ...
innerstart:
        jsr     r5,main         / always 004567 000042
        halt
        ...
main:
        inc     somevalue       / always 005267 000136 or 005267 000140
        ...
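As an aside on the octal words quoted in the listing above, the following Python sketch decodes them under the assumption that they are ordinary PDP-11 encodings: 004RDD is JSR, 0052DD is INC, and a destination field of 67 means PC-relative (index mode on r7), so the second word is an offset from the address that follows it. The helper names and the load address in the example are made up for illustration.

```python
def decode(word, offset):
    """Decode a two-word PDP-11 instruction given as integers.

    Only the two opcodes seen in the listing are handled, and only
    PC-relative (mode 6, register 7) destinations.
    """
    dest = word & 0o77
    if dest != 0o67:
        raise ValueError("only PC-relative destinations are handled here")
    if word & 0o177000 == 0o004000:          # JSR Rn,dst
        reg = (word >> 6) & 0o7
        return ("jsr", "r%d" % reg, offset)
    if word & 0o177700 == 0o005200:          # INC dst
        return ("inc", None, offset)
    raise ValueError("opcode not handled in this sketch")


def pc_relative_target(instr_addr, offset):
    """Effective address: the PC has already advanced past both words."""
    return (instr_addr + 4 + offset) & 0o177777


if __name__ == "__main__":
    print(decode(0o004567, 0o000042))
    print(decode(0o005267, 0o000136))
    # Hypothetical address 0o1000 for the jsr, purely for illustration
    print(oct(pc_relative_target(0o1000, 0o000042)))
```

Because the offset is relative to the instruction's own location, a constant second word such as 000042 means the callee always sits at the same distance from the call site, whatever the absolute load address turns out to be.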
Finally, these all appear to have the four characters "Init" strewn in there with a bunch of binary data. It is consistently in all of the above binaries exhibiting these other two patterns.

I don't have any confirmed B code/binaries to compare against, so I can't say that this is a signature of B, but it is a signature of *something*. Whatever it is these all share it. This can be compared with disassembly of programs like:

- ln, a "naked" binary in that it has no a.out header that immediately calls break
- chown, a V1 a.out (magic number 405(8)) that jumps immediately into the assembly routine
- find, a V2 a.out (magic number 407(8)) that does the same
- fc, a V2 a.out compiled from C that branches into a crt0 startup that then jumps to main, this crt0 startup matches the crt0.o file in s2-bits

Something of note too with the above programs is that the earliest preserved versions of each are all in C. That being exit, glob, goto, and if from the s1-bits tape (V3-ish) as well as echo, mail, stty, and su in V5 (not sure what skip is/was).

This is all pretty compelling, but conjecture nonetheless. Perhaps it will draw hard proof closer.

- Matt G.

P.S., these are the other s2-bits binaries by "signature":

- naked - cal, chmod, dsw, ln, rm
- V1 asm - :, ar, bas, cat, chball, check, chown, cmp, cp, date, db, dc, df, du, ed, form, getty, init, login, ls, mesg, mkdir, msh, mv, od, pr, rew, rmdir, roff, sh, sort, stat, sum, tap, tm, tty, wc, who, write
- V2 asm - as, as2, ds, find, ld, maki, nm, strip, un
- V2 C - cc, fc, size

From phil at ultimate.com Thu Jun 8 13:31:53 2023
From: phil at ultimate.com (Phil Budne)
Date: Wed, 07 Jun 2023 23:31:53 -0400
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID: <202306080331.3583Vrw7057546@ultimate.com>

An unpleasant aspect of B on the PDP-11 seemed to be that data addresses were stored as "word addresses" (divided by two).
Address "fix-ups" were done before starting any user or other run-time code. I wrote a comment about this at https://github.com/philbudne/pdp11-B/blob/pb/source/brt/brt1.s#L67 (which is my reconstruction of the brt files). Alas, I didn't note the origin of the SCJ recollection of DMR's hack.

B code from "libb" (disassembled by Angelo Papenhoff?) shows the initial branch:

http://squoze.net/B/libb/printf.s
http://squoze.net/B/libb/printn.s

Although neither file has any fixups.

The signature I would expect from binary B code of this era would be that the generated code from each source file starts with a branch (or jmp) around the contents of the file, to a "jsr r5, chain" followed by a zero terminated list of addresses (which I guessed were addresses of address words that needed to be fixed up).

I would expect the code at "chain" to loop through the words referenced by (r5)+ "fixing" them, and finally returning using "rts r5", something like the code I wrote at https://github.com/philbudne/pdp11-B/blob/pb/source/brt/brt1.s#L102

chain:  mov     (r5)+,r0        // fetch pointer pointer
        beq     1f              // quit on zero word
        asr     (r0)            // adjust the referenced word
        br      chain
1:      rts     r5              // return to end of file, fall into next

If the utilities you mention were in fact written in B, that would offer us the chance to recover the actual code used in brt1 and brt2. The signature I would expect looks VERY MUCH like what you describe:

> This "signature" I refer to being a few properties of the a.out files and initial flow of the entry compared with other binaries of known source code origin. First, these are all magic number 405(8) binaries, so V1 era a.out. Second, in each case, the initial branch is to a jump vector which then performs a r5-relative subroutine call followed by a halt in the case of fallthrough. In other words:
>
>         br      _start          / 405(8)
>         ...
> _start:
>         jmp     innerstart      / some faraway place
>         ...
> innerstart:
>         jsr     r5,main         / always 004567 000042
>         halt
>         ...
> main:
>         inc     somevalue       / always 005267 000136 or 005267 000140
>         ...

The fact that the jsr r5 always points to a small, fixed address is likely because it points to B runtime code loaded at the start of memory, which doesn't exactly match what's described in section 10.0 in https://www.bell-labs.com/usr/dmr/www/kbman.html:

    ld object /etc/brt1 -lb /etc/bilib /etc/brt2

The initial jmp is the file prologue emitted by the B compiler, and the code at "innerstart" is the epilogue, which I would expect to be "jsr r5, chain".

I believe the "halt" is a literal zero word (terminating the fixup list) and not a halt instruction, and that the chain routine (auto) increments r5 until it sees a zero word, and then returns (likely via "rts r5") to the word after the zero word.

From arnold at skeeve.com Fri Jun 9 00:41:59 2023
From: arnold at skeeve.com (arnold at skeeve.com)
Date: Thu, 08 Jun 2023 08:41:59 -0600
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net>
Message-ID: <202306081441.358EfxaA028814@freefriends.org>

Andrew Hume wrote:
> i was in 1127 in 1981 (for an internship) and then from 1983 onwards.
> the only B code i saw or knew of was the driver code that ran in the
> Linotron typesetter. it ran on an 8in floppy and my recollection is
> that the thing that mattered most was its very small run-time support.
> i’m pretty sure it was Ken’s code but i never knew any other details.
>

See https://www.cs.princeton.edu/~bwk/202/ for info about the Linotron and Ken's code. Great reading, both the original memo and the story behind reconstructing it.
Arnold From tuhs at tuhs.org Fri Jun 9 01:05:36 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Thu, 08 Jun 2023 15:05:36 +0000 Subject: [TUHS] Software written in B In-Reply-To: <202306080331.3583Vrw7057546@ultimate.com> References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: > The signature I would expect from binary B code of this era would be > that the generated code from each source file starts with a branch (or > jmp) around the contents of the file, to a "jsr r5, chain" followed by > a zero terminated list of addresses (which I guessed were addresses of > address words that needed to be fixed up). Looking a little closer I think that is what this is, because each file is an a.out header, then a jmp, followed by what I presume is B object code, then the destination of the jmp at that jsr r5 that passes into a routine that I think is then what handles that 0-terminated table of address words. All of the files have a similar bit up to the data word this opening process increments, so I suspect those are the bounds of brt1, from the opening vector (that the header of the B object jumps to) to the data flag that gets set by the inc operation. My assumption is that the B objects were stamped with a jmp that simply jumped to whatever the first address past the end was, so then brt1 had to be physically right there to accept flow. After that point the remaining bits in the B files aren't as similar, but what I can say is I don't see anything on the tail end of these binaries that is consistent enough between them to peg as a brt2. Instead each seems to be a slightly different jumble of interpreter routines themselves. At least I think, this is a very high level assessment though, I haven't fully broken any of these into individual parts yet. 
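The zero-terminated fix-up table discussed in this thread can be sketched in Python as follows. This is a toy model, not recovered runtime code: memory is a plain list of 16-bit words, `fixup` is an invented name, and halving each referenced word follows the reconstructed "chain" routine quoted earlier in the thread (mov (r5)+,r0 / beq / asr (r0)), which is itself a guess at what the original runtime did.

```python
def fixup(memory, table_start):
    """Walk a zero-terminated pointer table and halve each referenced word.

    memory      -- list of 16-bit words, indexed by byte address // 2
    table_start -- byte address of the first table entry (what r5 would
                   hold right after "jsr r5,chain")
    Returns the byte address just past the terminating zero word, which
    is where "rts r5" would resume.
    """
    r5 = table_start
    while True:
        pointer = memory[r5 // 2]     # mov (r5)+,r0
        r5 += 2
        if pointer == 0:              # beq 1f
            return r5                 # rts r5
        # asr (r0): turn a byte address into a word address
        # (values are assumed non-negative in this sketch)
        memory[pointer // 2] >>= 1


if __name__ == "__main__":
    mem = [0] * 16
    mem[4] = 0o30                  # word at byte address 0o10 holds byte address 0o30
    mem[5] = 0o24                  # word at byte address 0o12 holds byte address 0o24
    mem[8:11] = [0o10, 0o12, 0]    # fix-up table at byte address 0o20
    resume = fixup(mem, 0o20)
    print(oct(mem[4]), oct(mem[5]), oct(resume))
```

Under this model the word that looks like a "halt" after the jsr would simply be the zero that ends the table, which matches the reading offered earlier in the thread.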
By the way, one characteristic of this supposed brt1 code is that it checks that the first word after the jump in the B object is 40022(8) (which is the in-core address of the next word in the B object btw). If it is not present, or the B runtime did not set the data flag indicated above as the end of brt1, then it simply prints "Init\n" on stdout and exits. Only if both this B "magic number" and the flag indicating proper entry are set does it seem to proceed rather than just printing Init and exiting. Not sure what this means, or what the reasoning behind this behavior is, but that explains the "Init" string in each binary, it is also part of the B runtime. - Matt G. From beebe at math.utah.edu Fri Jun 9 01:53:21 2023 From: beebe at math.utah.edu (Nelson H. F. Beebe) Date: Thu, 8 Jun 2023 09:53:21 -0600 Subject: [TUHS] Major update in unix.bib Message-ID: Thanks to Unix document recovery work by some TUHS list members that has been recently added to the archives at minnie.tuhs.org, a large Bell Labs bibliography about Unix has been uncovered and is now available online. I have spent time this week converting the 59-page PDF file to a somewhat searchable OCR'ed PDF, and from a text conversion of that file, to relatively clean BibTeX entries in https://www.math.utah.edu/pub/tex/bib/unix.bib [change .bib to .html for similar view with hypertext links]. The bibliography recorded in entry Scheiderman:1980:UB, with a long remark field, has 457 entries, from the years 1972 to 1980, but due to splitting into subject-specific sections, there are some duplications: 448 remain after data merger. Much work remains to be done, including locating electronic copies of those reports, correcting truncated data, and getting suitable URLs retrofitted into their BibTeX entries. However, I prefer a release-early-and-often approach to bibliographic data distribution, whence this announcement to the TUHS list and others. 
For searching convenience, I have created an SQLite3 portable database file at

	https://www.math.utah.edu/pub/tex/bib/unix.db

There is a tutorial on SQL searching of BibTeX data in a paper and talk slides at

	BibTeX meets relational databases
	https://www.math.utah.edu/~beebe/talks/#2009

and further documentation about BibTeX itself at

	BibTeX Information and Tutorial
	https://www.math.utah.edu/pub/bibnet/bibtex-info.html

Below are some samples of searches to find the earliest mention of selected topics in the Bell Labs technical memoranda. Even if you have never used SQL queries, they should be fairly understandable, and the new entries all carry identical bibtimestamp values to make their identification and selection easy.

% sqlite3 unix.db
.headers on
.mode table

-- Find the earliest entries
select label, year, author from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
    order by year, label limit 10;
+--------------------+------+-------------------------------------+
| label              | year | author                              |
+--------------------+------+-------------------------------------+
| McIlroy:1972:MTC   | 1972 | M. Douglas McIlroy                  |
| Ritchie:1972:UAR   | 1972 | Dennis M. Ritchie                   |
| McIlroy:1973:SES   | 1973 | M. Douglas McIlroy                  |
| Olsson:1973:GCC    | 1973 | S. B. Olsson                        |
| Remde:1973:CCS     | 1973 | J. R. Remde                         |
| Kernighan:1974:PCT | 1974 | Brian W. Kernighan                  |
| Lycklama:1974:ILC  | 1974 | Heinz Lycklama                      |
| Morris:1974:CDT    | 1974 | Robert Morris and Lorinda L. Cherry |
| Morris:1974:WSH    | 1974 | Robert Morris and Ken Thompson      |
| Swanson:1974:GFC   | 1974 | G. K. Swanson                       |
+--------------------+------+-------------------------------------+

-- Find earliest mentions of IBM mainframes
select label, year, title from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and (title like '%ibm%')
    order by year, label;
+------------------+------+--------------------------------------------------------------+
| label            | year | title                                                        |
+------------------+------+--------------------------------------------------------------+
| Roberts:1975:UIU | 1975 | UNIXLIST --- An IBM / 370 Utility Program to List a UNIX Fil |
|                  |      | e Stored on a 9-Track Magnetic Tape.                         |
+------------------+------+--------------------------------------------------------------+
| Bach:1979:PAD    | 1979 | Porting the ADAPT Data Translation System to the IBM 370     |
+------------------+------+--------------------------------------------------------------+
| Grampp:1979:SCI  | 1979 | Support for C on IBM Computers                               |
+------------------+------+--------------------------------------------------------------+
| Huber:1979:ULD   | 1979 | UNIX Line Discipline for IBM 2740-1 Protocol                 |
+------------------+------+--------------------------------------------------------------+

-- Find earliest mention of porting work to Intel 808x family
select label, year, title from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and ((title like '%intel %') or (title like '%z80%'))
    order by year, label limit 5;
+--------------------+------+--------------------------------------------------------------+
| label              | year | title                                                        |
+--------------------+------+--------------------------------------------------------------+
| Molinelli:1977:UAI | 1977 | UNIX Assembler For The Intel 8080 Microprocessor             |
+--------------------+------+--------------------------------------------------------------+
| Bradley:1978:EMS   | 1978 | Evaluation of Microprocessors Supporting the C Language: LSI |
|                    |      | -11, MAC-8, Z80                                              |
+--------------------+------+--------------------------------------------------------------+
| Farrell:1978:UGS   | 1978 | User's Guide to the SMAL2 Language for the Zilog Z80 Micropr |
|                    |      | ocessor                                                      |
+--------------------+------+--------------------------------------------------------------+
| Vogel:1978:ZAR     | 1978 | 8080 / Z80 Assembler Reference Manual                        |
+--------------------+------+--------------------------------------------------------------+
| Blumer:1979:UUI    | 1979 | UNIX / 86: UNIX on the Intel 8086                            |
+--------------------+------+--------------------------------------------------------------+

-- Find earliest mentions of port to Interdata machines [note the full
-- text search of entry, rather than just of the title]
select label, year, title from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and (entry like '%interdata%')
    order by year, label limit 5;
+------------------+------+---------------------------------+
| label            | year | title                           |
+------------------+------+---------------------------------+
| Johnson:1977:CLC | 1977 | The C Language Calling Sequence |
+------------------+------+---------------------------------+

-- Find earliest mentions of 36-bit Univac Unix
select label, year, author from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and (title like '%univac%')
    order by year, label;
+----------------+------+---------------------------------+
| label          | year | author                          |
+----------------+------+---------------------------------+
| Lyons:1976:GUR | 1976 | T. G. Lyons                     |
| Graaf:1979:PPE | 1979 | D. A. De Graaf and Jerome Feder |
+----------------+------+---------------------------------+

-- Find authors of earliest mentions of Programmer's Workbench (PWB)
select label, year, author from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and ((title like '%PWB%') or (title like '%workbench%'))
    order by year, label limit 5;
+------------------+------+-------------------------------+
| label            | year | author                        |
+------------------+------+-------------------------------+
| Dolotta:1975:PWP | 1975 | T. A. Dolotta and others      |
| Dolotta:1976:PWP | 1976 | T. A. Dolotta and others      |
| Lyons:1976:GUR   | 1976 | T. G. Lyons                   |
| Smith:1976:NTF   | 1976 | D. W. Smith                   |
| Dolotta:1977:PUV | 1977 | T. A. Dolotta and D. W. Smith |
+------------------+------+-------------------------------+

-- Find titles of earliest mentions of Programmer's Workbench (PWB)
select label, year, title from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and ((title like '%PWB%') or (title like '%workbench%'))
    order by year, label limit 5;
+------------------+------+--------------------------------------------------------------+
| label            | year | title                                                        |
+------------------+------+--------------------------------------------------------------+
| Dolotta:1975:PWP | 1975 | Programmer's Workbench Papers From The Second International  |
|                  |      | Conference On Software Engineering (G.4)                     |
+------------------+------+--------------------------------------------------------------+
| Dolotta:1976:PWP | 1976 | Programmer's Workbench Papers From The Second International  |
|                  |      | Conference on Software Engineering. (G.4)                    |
+------------------+------+--------------------------------------------------------------+
| Lyons:1976:GUR   | 1976 | Guide to UNIVAC Remote Job Entry for Programmer's Workbench  |
|                  |      | Users                                                        |
+------------------+------+--------------------------------------------------------------+
| Smith:1976:NTF   | 1976 | NROFF / TROFF Formatting Codes For Departmental Organization |
|                  |      | Directories On PWB / UNIX                                    |
+------------------+------+--------------------------------------------------------------+
| Dolotta:1977:PUV | 1977 | PWB / UNIX View Graph and Slide Macros (T.9)                 |
+------------------+------+--------------------------------------------------------------+

-- Find earliest mentions of nroff and troff
select label, year, title from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and (title like '%roff%')
    order by year, label limit 5;
+---------------------+------+--------------------------------------------------------------+
| label               | year | title                                                        |
+---------------------+------+--------------------------------------------------------------+
| Smith:1976:NTF      | 1976 | NROFF / TROFF Formatting Codes For Departmental Organization |
|                     |      | Directories On PWB / UNIX                                    |
+---------------------+------+--------------------------------------------------------------+
| Cummingham:1977:NPG | 1977 | NROFF For Producing Generic Program Documentation            |
+---------------------+------+--------------------------------------------------------------+
| Kernighan:1978:TT   | 1978 | A TROFF Tutorial                                             |
+---------------------+------+--------------------------------------------------------------+
| Lesk:1978:TDU       | 1978 | Typing Documents on the UNIX System: Using the tt -ms Macros |
|                     |      | with Troff and Nroff                                         |
+---------------------+------+--------------------------------------------------------------+
| Ossanna:1979:NTU    | 1979 | NROFF / TROFF User's Manual                                  |
+---------------------+------+--------------------------------------------------------------+

-- Find earliest mentions of typesetting
select label, year, title from bibtab
    where (filename = 'unix.bib') and (bibtimestamp like '2023.06.06%')
          and (title like '%typeset%')
    order by year, label limit 5;
+--------------------+------+-------------------------------------------------------+
| label              | year | title                                                 |
+--------------------+------+-------------------------------------------------------+
| Lesk:1976:CTTa     | 1976 | Computer Typesetting of Technical Journals on UNIX    |
| Edelson:1977:TAA   | 1977 | Typesetting ACS and APS Meeting Abstracts --- Issue 2 |
| Vogel:1977:EPV     | 1977 | Easy Phototypeset View Graphs on UNIX                 |
| Kernighan:1978:TMU | 1978 | Typesetting Mathematics --- User's Guide              |
| Kernighan:1979:STM | 1979 | A System for Typesetting Mathematics                  |
+--------------------+------+-------------------------------------------------------+

As always, comments on, corrections for, and addenda to, unix.bib are most welcome: just send me e-mail.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                                                          -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe at math.utah.edu  -
- 155 S 1400 E RM 233                       beebe at acm.org  beebe at computer.org  -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------

From sjenkin at canb.auug.org.au Fri Jun 9 06:58:09 2023
From: sjenkin at canb.auug.org.au (steve jenkin)
Date: Fri, 9 Jun 2023 06:58:09 +1000
Subject: [TUHS] Major update in unix.bib
In-Reply-To:
References:
Message-ID:

Total entries: 4297

At version 4.68, data from a recently discovered Bell Laboratories document, UNIX Bibliography [Scheiderman:1980:UB] have been merged into this file.

=============

Massive effort, so impressive. Plus the SQLite database. Nice touch.

Matt G’s constant flow of new docs is pushing the boundaries of ‘known docs’ back.

> On 9 Jun 2023, at 01:53, Nelson H. F.
Beebe wrote: > > Thanks to Unix document recovery work by some TUHS list members that > has been recently added to the archives at minnie.tuhs.org, a large > Bell Labs bibliography about Unix has been uncovered and is now > available online. I have spent time this week converting the 59-page > PDF file to a somewhat searchable OCR'ed PDF, and from a text > conversion of that file, to relatively clean BibTeX entries in > > https://www.math.utah.edu/pub/tex/bib/unix.bib > > [change .bib to .html for similar view with hypertext links]. > > The bibliography recorded in entry Scheiderman:1980:UB, with a long > remark field, has 457 entries, from the years 1972 to 1980, but due to > splitting into subject-specific sections, there are some duplications: > 448 remain after data merger. > > Much work remains to be done, including locating electronic copies of > those reports, correcting truncated data, and getting suitable URLs > retrofitted into their BibTeX entries. -- Steve Jenkin, IT Systems and Design 0412 786 915 (+61 412 786 915) PO Box 38, Kippax ACT 2615, AUSTRALIA mailto:sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin From f4grx at f4grx.net Fri Jun 9 18:56:27 2023 From: f4grx at f4grx.net (Sebastien F4GRX) Date: Fri, 9 Jun 2023 10:56:27 +0200 Subject: [TUHS] Software written in B In-Reply-To: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> Message-ID: Hello, You have sent me a lot of valuable information about all of this software environment, some of which I had already found, but most is new for me. It was extremely interesting to read. In particular I had seen the GCOS documentation, but did not consider deeply because they seem to describe a much improved and extended B language. I am able to contact Alan Cox (aka EtchedPixels) via Mastodon, I will ask him about AberMUD. 
Thank you Phil Budne for the PDP-7 listings, it's amazing to see that these two are even older since they use $( )$ instead of braces for compound statements. And the linotron saga is absolutely fantastic! Thank you everyone! Sebastien Le 07/06/2023 à 12:14, Sebastien F4GRX a écrit : > Hello everyone, > > this is my first post on this list. > > > After looking at the archives for this mailing list, I have seen that > the B language has been discussed several times already. > > After viewing Ken Thompson's interview by Brian Kernighan at VCF East > 2019, I became interested in the B language, as it seemed > full-featured for system programming, close to C, and simple enough to > write a parser for it without a code generation tool. > > So for fun and self-education, I am now writing a (or yet another) B > compiler, in C, after reading Jack Crenshaw's "Let's build a compiler" > documentation ( https://compilers.iecc.com/crenshaw/ ) > > Here it is: https://git.sr.ht/~f4grx/bpars > > It is now starting to generate code for the 68hc11 8-bit platform. It > can also generate C code. > > > I have written some test programs, found some B examples, but I > thought it would be great to use my compiler with actual B software. > > Of course, B was a "transition" language, that did not have a > continued use as soon as it evolved into C. so if any software > remains, it will be quite hard to find. > > And here is my question, is any of you aware of original B source code > archives? or are in touch with people that would know? > > > In particular, I read on this document written by Dennis Ritchie: > https://www.bell-labs.com/usr/dmr/www/chist.html > > > After the TMG version of B was working, Thompson rewrote B in itself > (a bootstrapping step). > > > I have also read that the YACC tool was initially written in B. > > There might be other historical B sources that I am not aware of. > > > Do you know if any of this code has survived to this day? 
Where could > I find more information about this? > > > Thank you very much, > > Sebastien Lorquet (F4GRX) > From lars at nocrew.org Fri Jun 9 19:57:07 2023 From: lars at nocrew.org (Lars Brinkhoff) Date: Fri, 09 Jun 2023 09:57:07 +0000 Subject: [TUHS] Software written in B In-Reply-To: (Sebastien F4GRX's message of "Fri, 9 Jun 2023 10:56:27 +0200") References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> Message-ID: <7wcz255570.fsf@junk.nocrew.org> Sebastien F4GRX wrote: > I am able to contact Alan Cox (aka EtchedPixels) via Mastodon, I will > ask him about AberMUD. He may refer you to Alec Muffett, who has a paper listing: https://alecmuffett.com/article/12714 From douglas.mcilroy at dartmouth.edu Sat Jun 10 05:53:00 2023 From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy) Date: Fri, 9 Jun 2023 15:53:00 -0400 Subject: [TUHS] How should I define 'thing' before I start talking about it? Message-ID: > As far as I see it, EQN input is made of things where a thing is one of > mathematical or troff or eqn symbol > mathematical punctuation > delimiter, these being space, '~', '^', '{', '}', or newline > a character surrounded by punctuation or delimiters > - with/without users errors > a word surrounded by punctuation or delimiters > - with/without users errors > EQN keywords (which are special words???) > That is way too long winded!! I want something tight. Quotes take precedence over all other things. What is a "troff symbol"? I can only think of troff escapes, which can only appear in quotes , so are not eqn things in their own right. (In user errors they may be seen as punctuation, etc.) Ditto for "eqn symbol"? Perhaps the union of some or all of the things that follow in the list? Ditto again for "mathematical punctuation". Comma is one example I can think of. Apostrophe, read as "prime", may be another. Are there more? Is "surrounded by" inclusive or exclusive? Why is "character" distinguished from "word"? The ??? 
question bears on the issue of whether sintheta is one thing or two. The word "maximal" will probably figure in the answer. Another (sticky) point is how punctuation sticks to an adjacent thing. For example, eqn inserts space in (a,b), but keeps the whole thing(?) together in 2 sup (a,b). Doug From douglas.mcilroy at dartmouth.edu Sat Jun 10 06:41:49 2023 From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy) Date: Fri, 9 Jun 2023 16:41:49 -0400 Subject: [TUHS] Re How should I define 'thing' before I start talking about it? Message-ID: Mea culpa. eqn does recognize groff escapes outside of quotes. Doug From rminnich at gmail.com Tue Jun 13 04:22:11 2023 From: rminnich at gmail.com (ron minnich) Date: Mon, 12 Jun 2023 11:22:11 -0700 Subject: [TUHS] crt0 -- what's in that name? Message-ID: This came up lately in the riscv firmware universe. Someone named early boot bt0, I mentioned crt0, and ... when did that name first appear? I first saw it in v6 but I'm sure it was long before. From crossd at gmail.com Tue Jun 13 04:29:45 2023 From: crossd at gmail.com (Dan Cross) Date: Mon, 12 Jun 2023 14:29:45 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: On Mon, Jun 12, 2023 at 2:22 PM ron minnich wrote: > This came up lately in the riscv firmware universe. Someone named early boot bt0, I mentioned crt0, and ... when did that name first appear? I first saw it in v6 but I'm sure it was long before. The Unix tree shows it in 2nd Edition: https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s That would suggest it was more or less contemporaneous with C itself. - Dan C. From clemc at ccc.com Tue Jun 13 04:53:00 2023 From: clemc at ccc.com (Clem Cole) Date: Mon, 12 Jun 2023 14:53:00 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: This makes sense since there was B runtime to start, and Dennis was messing with things.
No idea but I wonder if that was the impetus for the rename from B to newB to C - when he introduced a new runtime? On Mon, Jun 12, 2023 at 2:30 PM Dan Cross wrote: > On Mon, Jun 12, 2023 at 2:22 PM ron minnich wrote: > > This came up lately in the riscv firmware universe. Someone named early > boot bt0, I mentioned crt0, and ... when did that name first appear? I > first saw it in v6 but I'm sure it was long before. > > The Unix tree shows it in 2nd Edition: > https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s > That would suggest it was more or less contemporaneous with C itself. > > - Dan C. > From tuhs at tuhs.org Tue Jun 13 05:45:12 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Mon, 12 Jun 2023 19:45:12 +0000 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: Probably derivative of /etc/brt1 and /etc/brt2. If there's a rt convention before that I can't say. If they're going for UNIX-y nomenclature though bootloaders were mboot, tboot, uboot, etc. As an aside, Sega used the nomenclature "icd_blkX" where X is a digit to number 128-byte blocks of their initial loader, icd I presume standing for something like initial code. I feel like I've seen "icd" used elsewhere, but couldn't say where. In any case, I'm sure a lot could be devoted to running down the history of names like crt0, mch, mdec, icd, uboot, and so on. Unfortunately those sorts of trivia haven't bubbled up in my manual studies. - Matt G. ------- Original Message ------- On Monday, June 12th, 2023 at 11:53 AM, Clem Cole wrote: > This makes sense since there was B runtime to start, and Dennis was messing with things. No idea but I wonder if that was the impetus for the rename from B to newB to C - when he introduced a new runtime?
> ᐧ > > On Mon, Jun 12, 2023 at 2:30 PM Dan Cross wrote: > >> On Mon, Jun 12, 2023 at 2:22 PM ron minnich wrote: >>> This came up lately in the riscv firmware universe. Someone named early boot bt0, I mentioned crt0, and ... when did that name first appear? I first saw it in v6 but I'm sure it was long before. >> >> The Unix tree shows it in 2nd Edition: >> https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s >> That would suggest it was more or less contemporaneous with C itself. >> >> - Dan C. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tuhs at tuhs.org Tue Jun 13 06:03:19 2023 From: tuhs at tuhs.org (Chris Pinnock via TUHS) Date: Mon, 12 Jun 2023 21:03:19 +0100 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> I had thought that crt stood for “compiler runtime”. You’ll find files on a NetBSD (and other BSDs) in /usr/lib/csu called crt0, crtbegin, crtend (etc) which are included in the compiled binaries at build time and are used to include machine dependent things need to initialise programs. (The acronym could be wrong of course - bss is the notorious one, where even the man page for a.out has this has a bug: "Nobody seems to agree on what bss stands for.”) C > On 12 Jun 2023, at 20:45, segaloco via TUHS wrote: > > Probably derivative of /etc/brt1 and /etc/brt2. If there's a rt convention before that I can't say. If they're going for UNIX-y nomenclature though bootloaders were mboot, tboot, uboot, etc. As an aside, Sega used the nomenclature "icd_blkX" where X is a digit to number 128-byte blocks of their initial loader, icd I presume standing for something like initial code. I feel like I've seen "icd" used elsewhere, but couldn't say where. In any case, I'm sure a lot could be devoted to running down the history of names like crt0, mch, mdec, icd, uboot, and so on. 
Unfortunately those sorts of trivia haven't bubbled up in my manual studies. > > - Matt G. > ------- Original Message ------- > On Monday, June 12th, 2023 at 11:53 AM, Clem Cole wrote: > >> This makes sense since there was B runtime to start, and Dennis was messing with things. No idea but I wonder if that was the impetus for the rename from B to newB to C - when he introduced a new runtime? >> ᐧ >> >> On Mon, Jun 12, 2023 at 2:30 PM Dan Cross wrote: >> On Mon, Jun 12, 2023 at 2:22 PM ron minnich wrote: >> > This came up lately in the riscv firmware universe. Someone named early boot bt0, I mentioned crt0, and ... when did that name first appear? I first saw it in v6 but I'm sure it was long before. >> >> The Unix tree shows it in 2nd Edition: >> https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s >> That would suggest it was more or less contemporaneous with C itself. >> >> - Dan C. > From dave at horsfall.org Tue Jun 13 06:17:12 2023 From: dave at horsfall.org (Dave Horsfall) Date: Tue, 13 Jun 2023 06:17:12 +1000 (EST) Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: On Mon, 12 Jun 2023, Dan Cross wrote: > The Unix tree shows it in 2nd Edition: > https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s > That would suggest it was more or less contemporaneous with C itself. I've always thought of it as "C run time stage 0". -- Dave From crossd at gmail.com Tue Jun 13 06:22:20 2023 From: crossd at gmail.com (Dan Cross) Date: Mon, 12 Jun 2023 16:22:20 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> References: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> Message-ID: On Mon, Jun 12, 2023 at 4:04 PM Chris Pinnock via TUHS wrote: > I had thought that crt stood for “compiler runtime”. 
You’ll find files on a NetBSD (and other BSDs) in /usr/lib/csu called crt0, crtbegin, crtend (etc) which are included in the compiled binaries at build time and are used to include machine dependent things need to initialise programs. Hmm. The comment at the top of `crt0.s` from 2nd Edition says, "C runtime startoff", which seems pretty clear. Whether that has changed over time is, of course, another matter (like how GCC changed to "GNU Compiler Collection"). > (The acronym could be wrong of course - bss is the notorious one, where even the man page for a.out has this has a bug: "Nobody seems to agree on what bss stands for.”) Huh. That seems to have come into the man page sometime after 4.3BSD-Tahoe; it's in Reno and Net/2, but not before (nor in other systems, that I can see). I thought it was pretty well known that it stands for, "Block Started (by) Symbol"? - Dan C. > > On 12 Jun 2023, at 20:45, segaloco via TUHS wrote: > > > > Probably derivative of /etc/brt1 and /etc/brt2. If there's a rt convention before that I can't say. If they're going for UNIX-y nomenclature though bootloaders were mboot, tboot, uboot, etc. As an aside, Sega used the nomenclature "icd_blkX" where X is a digit to number 128-byte blocks of their initial loader, icd I presume standing for something like initial code. I feel like I've seen "icd" used elsewhere, but couldn't say where. In any case, I'm sure a lot could be devoted to running down the history of names like crt0, mch, mdec, icd, uboot, and so on. Unfortunately those sorts of trivia haven't bubbled up in my manual studies. > > > > - Matt G. > > ------- Original Message ------- > > On Monday, June 12th, 2023 at 11:53 AM, Clem Cole wrote: > > > >> This makes sense since there was B runtime to start, and Dennis was messing with things. No idea but I wonder if that was the impetus for the rename from B to newB to C - when he introduced a new runtime?
> >> ᐧ > >> > >> On Mon, Jun 12, 2023 at 2:30 PM Dan Cross wrote: > >> On Mon, Jun 12, 2023 at 2:22 PM ron minnich wrote: > >> > This came up lately in the riscv firmware universe. Someone named early boot bt0, I mentioned crt0, and ... when did that name first appear? I first saw it in v6 but I'm sure it was long before. > >> > >> The Unix tree shows it in 2nd Edition: > >> https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s > >> That would suggest it was more or less contemporaneous with C itself. > >> > >> - Dan C. > > > From usotsuki at buric.co Tue Jun 13 06:25:31 2023 From: usotsuki at buric.co (Steve Nickolas) Date: Mon, 12 Jun 2023 16:25:31 -0400 (EDT) Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> References: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> Message-ID: On Mon, 12 Jun 2023, Chris Pinnock via TUHS wrote: > I had thought that crt stood for “compiler runtime”. You’ll find files > on a NetBSD (and other BSDs) in /usr/lib/csu called crt0, crtbegin, > crtend (etc) which are included in the compiled binaries at build time > and are used to include machine dependent things need to initialise > programs. I always understood crt to stand for "C runtime". -uso. From tuhs at tuhs.org Tue Jun 13 06:28:32 2023 From: tuhs at tuhs.org (Chris Pinnock via TUHS) Date: Mon, 12 Jun 2023 21:28:32 +0100 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> Message-ID: > On 12 Jun 2023, at 21:22, Dan Cross wrote: > > Hmm. The comment at the top of `crt0.s` from 2nd Edition says, "C > runtime startoff", which seems pretty clear. Whether that has changed > over time is, of course, another matter (like how GCC changed to "GNU > Compiler Collection"). 
Possibly - in this file http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/csu/README?rev=1.6&content-type=text/x-cvsweb-markup&only_with_tag=MAIN, the CSU and related files are referred to as the compiler runtime. But startoff is appropriate - because the file was usually included in the binary at the beginning to initialise stuff. These days ELF formats and similar have specific sections for initing and terminating binaries - although the crt name lives on with start and end:

servalan: {482} ls -la /usr/lib/crt*
-r--r--r--  1 root  wheel  4328 Jan 14 18:18 /usr/lib/crt0.o
-r--r--r--  1 root  wheel  2648 Jan 14 18:18 /usr/lib/crtbegin.o
-r--r--r--  1 root  wheel  2880 Jan 14 18:18 /usr/lib/crtbeginS.o
lrwxr-xr-x  1 root  wheel    10 Jan 14 18:18 /usr/lib/crtbeginT.o -> crtbegin.o
-r--r--r--  1 root  wheel  1264 Jan 14 18:18 /usr/lib/crtend.o
lrwxr-xr-x  1 root  wheel     8 Jan 14 18:18 /usr/lib/crtendS.o -> crtend.o
-r--r--r--  1 root  wheel  1488 Jan 14 18:18 /usr/lib/crti.o
-r--r--r--  1 root  wheel  1152 Jan 14 18:18 /usr/lib/crtn.o

>
>
> I thought it was pretty well known that it stands for, "Block Started
> (by) Symbol”?

I wrote a paper on a.out a year or so ago and concluded that I could not find an adequate answer - so avoided the issue with a noncommittal footnote.

C

From paul.winalski at gmail.com Tue Jun 13 06:58:48 2023
From: paul.winalski at gmail.com (Paul Winalski)
Date: Mon, 12 Jun 2023 16:58:48 -0400
Subject: [TUHS] crt0 -- what's in that name?
In-Reply-To:
References: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com>
Message-ID:

On 6/12/23, Chris Pinnock via TUHS wrote:
>
>> On 12 Jun 2023, at 21:22, Dan Cross wrote:
>>
>> Hmm. The comment at the top of `crt0.s` from 2nd Edition says, "C
>> runtime startoff", which seems pretty clear. Whether that has changed
>> over time is, of course, another matter (like how GCC changed to "GNU
>> Compiler Collection").
> > Possibly - in this file > http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/csu/README?rev=1.6&content-type=text/x-cvsweb-markup&only_with_tag=MAIN, > the CSU and related files are referred to as the compiler runtime. But > startoff is appropriate - because the file was usually included in the > binary at the beginning to initialise stuff. It may be that crt stood for "compiler run time" back when C was the only compiler in town. But once you get another language, such as Fortran, that has its own, different runtime initialization requirements, having the 'c' in crt0 mean "C" rather than "compiler" because it's no longer common to all compiler run time libraries. -Paul W. From ality at pbrane.org Tue Jun 13 07:28:09 2023 From: ality at pbrane.org (Anthony Martin) Date: Mon, 12 Jun 2023 14:28:09 -0700 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> Message-ID: Chris Pinnock via TUHS once said: > > On 12 Jun 2023, at 21:22, Dan Cross wrote: > > I thought it was pretty well known that it stands for, "Block Started > > (by) Symbol”? > > I wrote a paper on a.out a year or so ago and > concluded that I could not find an adequate answer > - so avoided the issue with a non-commital > footnote. Your paper says there are disagreements about what it stands for. What gave you that impression? >From https://www.tuhs.org/Usenet/comp.unix.wizards/1990-June/033811.html Dennis Ritchie says: Actually the acronym (in the sense we took it up; it may have other credible etymologies) is "Block Started by Symbol." It was a pseudo-op in FAP (Fortran Assembly [-er?] Program), an assembler for the IBM 704-709-7090-7094 machines. It defined its label and set aside space for a given number of words. There was another pseudo-op, BES, "Block Ended by Symbol" that did the same except that the label was defined by the last assigned word + 1. (On these machines Fortran arrays were stored backwards in storage and were 1-origin.) 
The usage is reasonably appropriate, because just as with standard Unix loaders, the space assigned didn't have to be punched literally into the object deck but was represented by a count somewhere. Cheers, Anthony From clemc at ccc.com Tue Jun 13 07:31:20 2023 From: clemc at ccc.com (Clem Cole) Date: Mon, 12 Jun 2023 17:31:20 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: On Mon, Jun 12, 2023 at 4:17 PM Dave Horsfall wrote: > On Mon, 12 Jun 2023, Dan Cross wrote: > > > The Unix tree shows it in 2nd Edition: > > https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s > > That would suggest it was more or less contemporaneous with C itself. > > I've always thought of it as "C run time stage 0". > crt - C RunTime. I always heard it expressed as C runtime SYSTEM or START. WRT BSS -- Block Start Symbol (and sometimes Block End Symbol in some later assemblers) I believe was (were) part of the original 704 assemblers from United Aircraft reserving a labeled block of uninitialized space in a "DUMMY SECTION" (or DSECT) for a hunk of storage. The OS is going to load everything together. So, a big feature of the United Aircraft assembler was to help control memory layout and collect like (common) hunks of things together (i.e., code vs data). The whole idea of BSS was to get the loader to reserve space that did not have to be initialized. As I understand it, the standard IBM FORTRAN (FAP) and Assembler (MAP) for the 709 and 7090/94 picked it up, with the new FORTRAN compiler being the big driver. From tuhs at tuhs.org Tue Jun 13 07:32:58 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Mon, 12 Jun 2023 21:32:58 +0000 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: <8DE09E23-5348-496B-B1CF-EDE9C47983B2@mac.com> Message-ID: > It may be that crt stood for "compiler run time" back when C was the > only compiler in town. > > -Paul W.
I don't know about this, only given the fact that B was already there and had a brt1 and brt2, crt0 seems like a natural follow-on to this naming scheme and I would suspect the c in crt0 referring to the language, not "compiler" was probably there from the genesis. These are the single letter associations I've found:

- a - assembly as in liba.a
- b - B lang as in libb.a, brt1, bilib
- c - C lang as in libc.a, crt0
- e - Explor as in libe.a
- f - Fortran as in fc, f77, libf.a, fr0
- l - LIL as in lc (LIL compiler)
- m - m6 as in /sys/lang/mdir

Of course letters get reused for things, this logic would imply the LIL language would have a libl.l but that instead is the lex library as appears a little later on. Y of course eventually gets associated with yacc. Not affirmative proof but I would be more inclined to suspect the c there is a language reference than standing for "compiler".

- Matt G.

P.S. The "bss" definition that lives in my head is "block-sized storage" but frankly I can't recall where I picked that up. I feel like I didn't just make that up but Google returns nothing.

From g.branden.robinson at gmail.com Tue Jun 13 07:39:12 2023
From: g.branden.robinson at gmail.com (G. Branden Robinson)
Date: Mon, 12 Jun 2023 16:39:12 -0500
Subject: [TUHS] crt0 -- what's in that name?
In-Reply-To:
References:
Message-ID: <20230612213912.mywv5znz66pk3n5q@illithid>

At 2023-06-12T17:31:20-0400, Clem Cole wrote:
> On Mon, Jun 12, 2023 at 4:17 PM Dave Horsfall wrote:
> > On Mon, 12 Jun 2023, Dan Cross wrote:
> > > The Unix tree shows it in 2nd Edition:
> > > https://www.tuhs.org/cgi-bin/utree.pl?file=V2/lib/crt0.s
> > > That would suggest it was more or less contemporaneous with C itself.
> >
> > I've always thought of it as "C run time stage 0".
> >
> crt - C RunTime.

It's an ill wind that blows a Fortran runtime using the same convention.

Regards,
Branden
-------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From paul.winalski at gmail.com Tue Jun 13 08:09:52 2023 From: paul.winalski at gmail.com (Paul Winalski) Date: Mon, 12 Jun 2023 18:09:52 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: Message-ID: On 6/12/23, Clem Cole wrote: > > WRT BSS -- Block Start Symbol (and sometimes Block End Symbol in some > later assemblers) I believe was (were) part of the original 704 assemblers > from United Aircraft reserving a labeled block of uninitialized space in a > "DUMMY SECTION" (or DSECT) for a hunk of storage. The OS is going to load > everything together. So, a big feature of the United Aircraft assembler was > to help control memory layout and collect like (common) hunks of things > together (i.e., code vs data). The whole idea of BSS was to get the loader > to reserve space that did not have to be initialized. As I understand it, the > standard IBM FORTRAN (FAP) and Assembler (MAP) for the 709 and 7090/94 > picked it up, with the new FORTRAN compiler being the big driver. > I don't recall either BSS or BES pseudo-ops in the System/360 assemblers for DOS/360 and OS/360. I think that by then they had been replaced by a more generalized concept of PSECTs and DSECTs. I forget now the details of how Fortran uninitialized common blocks were implemented in S/360 object language. It was the job of the link editor to overlay all such symbols with the same name, and if there was no explicit initializer, to allocate space for them in uninitialized memory. One thing to note is that, at least in DOS/360, such memory was not zeroed out. It contained random garbage. Security wasn't a concern in closed, raised-floor computer shops and you didn't want the program loader to waste time zeroing out memory anyway. -Paul W.
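As a rough illustration of the BSS and BES pseudo-ops described in Dennis Ritchie's note quoted earlier in this thread, the following Python sketch models an assembler's location counter. It is not FAP or SAP code and does not model any real assembler: the `Layout` class, its method names, and the origin of 100 are hypothetical, chosen only for illustration. The one thing taken from the thread is the rule itself: BSS gives the label the address of the first reserved word, while BES gives it the address of the last reserved word plus one.

```python
class Layout:
    """Toy location counter; a sketch, not a model of any real assembler."""

    def __init__(self, origin=0):
        self.loc = origin      # next free word address
        self.symbols = {}      # label -> address

    def bss(self, label, nwords):
        # Block Started by Symbol: the label names the first word of the block.
        self.symbols[label] = self.loc
        self.loc += nwords

    def bes(self, label, nwords):
        # Block Ended by Symbol: the label names the last reserved word + 1.
        self.loc += nwords
        self.symbols[label] = self.loc


layout = Layout(origin=100)
layout.bss("A", 10)    # reserves words 100..109, A = 100
layout.bes("B", 10)    # reserves words 110..119, B = 120
print(layout.symbols)  # {'A': 100, 'B': 120}
print(layout.loc)      # 120
```

Neither call stores any data; only the counter advances. That matches the point made in the quoted note that the reserved space did not have to be written out literally but could be represented by a count.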
From clemc at ccc.com Tue Jun 13 08:39:32 2023 From: clemc at ccc.com (Clem Cole) Date: Mon, 12 Jun 2023 18:39:32 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: <20230612213912.mywv5znz66pk3n5q@illithid> References: <20230612213912.mywv5znz66pk3n5q@illithid> Message-ID: Apologies to TUHS - other than please don't think Fortran did not impact UNIX and its peers. We owe that community our jobs, and for creating the market in that we all would build systems and eventually improve. Note: I'm CCing COFF if you want to continue this... On Mon, Jun 12, 2023 at 5:39 PM G. Branden Robinson < g.branden.robinson at gmail.com> wrote: > It's an ill wind that blows a Fortran runtime using the same convention. > Be careful there, weedhopper ... Fortran gave a lot to computing (including UNIX) and frankly still does. I did not write too much Fortran as a professional (mostly early in my career), but I did spend 50+ years ensuring that the results of the Fortran compiler ran >>really well<< on the systems I built. As a former colleague of Paul W and me once said, "*Any computer executive that does not take Fortran seriously will not have their job very long.* It pays our salary." It's still the #1 language for science [it's also not the same language my Father learned in the late 50s/early 60s, much less the one I learned 15 years later - check out: "In what type of work is the Fortran Programming Language most used today", "Is Fortran still alive", "Is Fortran obsolete"]. FWIW: These days, the Intel Fortran compiler (and eventually the LLVM one, of which Intel is the primary developer), calls the C/C++ common runtime for support. Most libraries are written in C, C++, (or assembler in some very special cases) - so now it's C that keeps Fortran alive. But "in the beginning" it was all about Fortran because that paid the bills then and still does today. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From g.branden.robinson at gmail.com Tue Jun 13 08:50:13 2023 From: g.branden.robinson at gmail.com (G. Branden Robinson) Date: Mon, 12 Jun 2023 17:50:13 -0500 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: <20230612213912.mywv5znz66pk3n5q@illithid> Message-ID: <20230612225013.ru6jfhv2tmjjftxn@illithid> Hi Clem, At 2023-06-12T18:39:32-0400, Clem Cole wrote: > Apologies to TUHS - other than please don't think Fortran did not > impact UNIX and its peers. We owe that community our jobs, and for > creating the market in that we all would build systems and eventually > improve. Absolutely. Fortran (77) was the first language this weedhopper learned after BASIC (which, while much despised by the sorts of people who update jargon files, _also_ had early support in CSRC Unix). While I intensely disliked the fixed-source format (a defect Fortran 90 remedied), I acquired it more easily than C, to the relief of the guys on my class project team who already knew C and _hated_ Fortran. My wisecrack was not meant as a derogation of Fortran in any way, but rather as a sly (not really) allusion to a word also appearing in your expansion of the COFF list's name as seen above... Best regards to you and to Fortran, and a nod to the copy of Metcalf/Reid/Cohen on my bookshelf, Branden -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From paul.winalski at gmail.com Tue Jun 13 09:04:31 2023 From: paul.winalski at gmail.com (Paul Winalski) Date: Mon, 12 Jun 2023 19:04:31 -0400 Subject: [TUHS] crt0 -- what's in that name? In-Reply-To: References: <20230612213912.mywv5znz66pk3n5q@illithid> Message-ID: On 6/12/23, Clem Cole wrote: > > On Mon, Jun 12, 2023 at 5:39 PM G. Branden Robinson < > g.branden.robinson at gmail.com> wrote: > >> It's an ill wind that blows a Fortran runtime using the same convention. >> > Be careful there, weedhopper ... 
I don't think this remark was intended to denigrate Fortran in any way. I took it as a wryly humorous way to make the observation that C and Fortran have different program startup semantics, and that there is other stuff that has to be done when firing up a program written wholly or partially in Fortran beyond what is needed to start up a C application. Most operating system ABIs, Unix included, don't have a formalized mechanism for dealing with the differences between startup semantics of various programming languages. They deal with the problem in an ad-hack fashion. The one exception that I know of is VMS (now OpenVMS). Tom Hastings was the architect who designed the original VAX/VMS ABI. He was aware from the get-go that several programming languages had to be supported and he made sure that his design was general enough to allow programmers to write routines in the most suitable language for them, to mix and match modules written in different languages in the same program, and to easily make calls from one language to another. It was a stroke of genius and I haven't seen its like in any other OS (several times I've wished it was there, though). Further discussion in COFF. -Paul W. From douglas.mcilroy at dartmouth.edu Tue Jun 13 10:46:42 2023 From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy) Date: Mon, 12 Jun 2023 20:46:42 -0400 Subject: [TUHS] crt0 -- what's in that name? Message-ID: >I thought it was pretty well known that it [BSS] stands for, "Block Started (by) Symbol"? BSS was a "pseudo-operation" in SAP (SHARE assembly program) for the IBM 704. My recollection is that the assembler manual called it "block starting at symbol". There was also a BES (block ending at symbol) pseudo-op. Both reserved a block of memory, with the assembler assigning the appropriate value to the pseudo-op's label. The reason for BES was that index registers were subtractive. 
There was a loop-ending instruction, TIX (transfer on index), that decreased the index by a specified amount and transferred to a specified location unless the index hit zero, in which case the instruction counter continued in sequence. BES was originally conceived for addressing an array stored by increasing subscript but indexed by a register that counted down.

BES was also useful for FORTRAN object code, which stored arrays backward and kept the true, uncomplemented subscript in an index register.

Doug

From norman at oclsc.org Tue Jun 13 11:37:40 2023
From: norman at oclsc.org (Norman Wilson)
Date: Mon, 12 Jun 2023 21:37:40 -0400 (EDT)
Subject: [TUHS] crt0 -- what's in that name?
Message-ID: <21D1C841C4310FE1829023B424252295.for-standards-violators@oclsc.org>

Clem Cole:

> Apologies to TUHS - other than please don't think Fortran did not
> impact UNIX and its peers.

Fortran had an important (if indirect) influence in early Unix. From
Dennis's memories of the early days of Unix on the PDP-7:

  Soon after TMG became available, Thompson decided that we could not
  pretend to offer a real computing service without Fortran, so he sat
  down to write a Fortran in TMG. As I recall, the intent to handle
  Fortran lasted about a week. What he produced instead was a definition
  of and a compiler for the new language B.

(The Evolution of the Unix Time-Sharing System; see the 1984
UNIX System issue of the BLTJ for the whole thing, or just read
https://www.bell-labs.com/usr/dmr/www/hist.html)

Now let's move on to the name `rc'. Not the shell, but the
usage as part of a file name. Those two characters appear
at the end of the many annoying, and mostly pointless, configuration
files that litter one's home directory these days, apparently
copied from the old system-startup script /etc/rc as if the
name means `startup commands' (or something beginning with r,
I suppose, instead of startup).
But I recall reading somewhere
that it just stood for `runcom,' a Multics-derived term for what
we now call a shell script.

I can't find a citation to back up that claim, though. Anyone
else remember where to look?

Norman Wilson
Toronto ON

From robpike at gmail.com Tue Jun 13 11:41:25 2023
From: robpike at gmail.com (Rob Pike)
Date: Tue, 13 Jun 2023 11:41:25 +1000
Subject: [TUHS] crt0 -- what's in that name?
In-Reply-To: <21D1C841C4310FE1829023B424252295.for-standards-violators@oclsc.org>
References: <21D1C841C4310FE1829023B424252295.for-standards-violators@oclsc.org>
Message-ID:

Not a citation but the 127 (as opposed to 1127) crowd all called them runcoms.

-rob

From crossd at gmail.com Tue Jun 13 11:48:58 2023
From: crossd at gmail.com (Dan Cross)
Date: Mon, 12 Jun 2023 21:48:58 -0400
Subject: [TUHS] crt0 -- what's in that name?
In-Reply-To: <21D1C841C4310FE1829023B424252295.for-standards-violators@oclsc.org>
References: <21D1C841C4310FE1829023B424252295.for-standards-violators@oclsc.org>
Message-ID:

On Mon, Jun 12, 2023, 9:37 PM Norman Wilson wrote:
> I can't find a citation to back up that claim, though. Anyone
> else remember where to look?
>

Not a citation, either, but I believe the original RUNCOM came from CTSS (https://multicians.org/shell.html), and MDN-4 on the design of the Multics shell mentions the term and MDN-5 goes into detail here. Newer Multics calls this "exec_com", as in the shell startup file `start_up.ec` that Multics users have in their login directories.

https://web.mit.edu/multics-history/source/Multics/doc/info_segments/exec_com.info

- Dan C.

From dave at horsfall.org Tue Jun 13 15:28:02 2023
From: dave at horsfall.org (Dave Horsfall)
Date: Tue, 13 Jun 2023 15:28:02 +1000 (EST)
Subject: [TUHS] crt0 -- what's in that name?
In-Reply-To:
References:
Message-ID:

On Mon, 12 Jun 2023, Clem Cole wrote:

> > I've always thought of it as "C run time stage 0".
>
> crt - C RunTime. I always heard it expressed as C runtime SYSTEM or
> START

Ah; well, I was almost right :-) It's been a while since I was using Ed 5 where I first saw that comment...

-- Dave

From rudi.j.blom at gmail.com Tue Jun 13 20:03:41 2023
From: rudi.j.blom at gmail.com (Rudi Blom)
Date: Tue, 13 Jun 2023 17:03:41 +0700
Subject: [TUHS] crt0 -- what's in that name?
Message-ID:

Maybe not really 'defining' but useful

https://en.wikipedia.org/wiki/Crt0
https://en.wikipedia.org/wiki/.bss

--
The more I learn the better I understand I know nothing.

From douglas.mcilroy at dartmouth.edu Tue Jun 13 22:10:52 2023
From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy)
Date: Tue, 13 Jun 2023 08:10:52 -0400
Subject: [TUHS] crt0 -- what's in that name?
Message-ID:

> Not a citation, either, but I believe the original RUNCOM came from CTSS (https://multicians.org/shell.html),

Yes, the CTSS command "runcom" arranged for commands stored in a file to be run in the background. Such a file (which could contain at most six commands) became known as "a runcom".

The term "script" did not emerge until Unix. I vaguely recall that Lee McMahon coined the usage, but would welcome more reliable info about its origin.

Doug

From rminnich at gmail.com Wed Jun 14 02:37:26 2023
From: rminnich at gmail.com (ron minnich)
Date: Tue, 13 Jun 2023 09:37:26 -0700
Subject: [TUHS] crt0 -- what's in that name?
In-Reply-To:
References:
Message-ID:

thanks all. In 1976, at udel, we thought it meant c runtime startoff and that seems to be a reasonable interpretation.

Kind of nice that a very new system, oreboot, will have a bt0 performing about the same function :-)

From douglas.mcilroy at dartmouth.edu Wed Jun 14 11:58:51 2023
From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy)
Date: Tue, 13 Jun 2023 21:58:51 -0400
Subject: [TUHS] undiagnosed pic error
Message-ID:

There may be a simple generic way to correct pic's habit of accepting
any set of object modifiers on any object, but obeying only a
compatible subset.

Pic already collects a bit vector of modifier types attached to the
current object. If that were extended with a few more bits that
designate the object types, the size, B, of the bit vector would be
about 35--an easy fit in one 64-bit word.
Then a BxB bit matrix could
record both modifier/modifier incompatibilities and object/modifier
incompatibilities. The collected bit vector needs to be tested against
the matrix once per object definition.

It seems to be harder to catch duplication of modifiers, requiring
extra code at all points where bits are set. Nevertheless, this kind
of error also merits detection.

Some questions

Does anybody think the issue is not worth addressing?

Is there a better scheme than that suggested above?

Is the scheme adequate? It would not, for example, catch a three-way
incompatibility that does not entail any pairwise incompatibility,
should such an incompatibility exist.

Any other thoughts?

Doug

From athornton at gmail.com Wed Jun 14 11:59:14 2023
From: athornton at gmail.com (Adam Thornton)
Date: Tue, 13 Jun 2023 18:59:14 -0700
Subject: [TUHS] [COFF] Re: Re: crt0 -- what's in that name?
In-Reply-To:
References: <20230612213912.mywv5znz66pk3n5q@illithid> <20230612234953.pwu7oi6hyglsaqzs@illithid>
Message-ID:

On Tue, Jun 13, 2023 at 9:29 AM Paul Winalski wrote:
>
> VMS (officially OpenVMS; I hated that marketing name when it was first
> proposed and I hate it now) is still alive and supported by a company
> called VMS Software, Inc. (VSI). Here is a pointer to their document
> OpenVMS Programming Concepts, Volume II, which describes the CLE in
> detail:
>
>

I think it's worth mentioning that the OpenVMS Hobbyist Program is still alive and well and recently began supplying x86_64 licenses to hobbyists, so if you have a reasonably modern amd64 system, you can run it under QEMU.

From marc.donner at gmail.com Wed Jun 14 20:41:53 2023
From: marc.donner at gmail.com (Marc Donner)
Date: Wed, 14 Jun 2023 06:41:53 -0400
Subject: [TUHS] undiagnosed pic error
In-Reply-To:
References:
Message-ID:

How sparse is the 35x35 matrix? For comprehensibility would it be the best way to do it?
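
A minimal sketch of the pairwise check Doug proposes, written in Python rather than pic's own C. Everything named here is an assumption for illustration: the object and modifier names, their bit positions, and the list of incompatible pairs are invented and are not taken from pic's grammar. The sketch only shows the mechanics: each object or modifier type gets one bit, row i of the BxB matrix is a mask of the bits that clash with bit i, and the collected vector is tested against the matrix once per object.

```python
# Hypothetical bit assignments; pic's real object and modifier sets differ.
NAMES = ["box", "circle", "line", "rad", "chop", "dashed", "invis"]
BIT = {name: 1 << i for i, name in enumerate(NAMES)}

# Illustrative incompatible pairs (assumptions, not pic's actual rules).
PAIRS = [("box", "chop"), ("line", "rad"), ("circle", "chop")]


def build_matrix(pairs):
    """Return one mask per bit position: rows[i] has bit j set when i and j clash."""
    rows = [0] * len(NAMES)
    for a, b in pairs:
        i, j = NAMES.index(a), NAMES.index(b)
        rows[i] |= 1 << j
        rows[j] |= 1 << i  # keep the matrix symmetric
    return rows


def collect(*names):
    """OR together the bits for an object and the modifiers attached to it."""
    vector = 0
    for name in names:
        vector |= BIT[name]
    return vector


def conflicts(vector, rows):
    """List every incompatible pair present in the collected bit vector."""
    found = []
    for i, mask in enumerate(rows):
        if vector >> i & 1:
            clash = vector & mask
            for j in range(i + 1, len(rows)):  # report each pair once
                if clash >> j & 1:
                    found.append((NAMES[i], NAMES[j]))
    return found


ROWS = build_matrix(PAIRS)
print(conflicts(collect("box", "dashed"), ROWS))
print(conflicts(collect("line", "rad", "chop"), ROWS))
```

Storing the matrix as B row masks keeps the test to one AND per set bit, so sparseness costs nothing beyond B words. As the proposal itself notes, a check of this shape sees neither duplicated modifiers nor a three-way incompatibility that has no pairwise component.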

--
=====
nygeek.net
mindthegapdialogs.com/home

From aap at papnet.eu Wed Jun 14 21:51:47 2023
From: aap at papnet.eu (Angelo Papenhoff)
Date: Wed, 14 Jun 2023 13:51:47 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Thank you two for finding this!
I did some disassembling yesterday and have uploaded brt1.s and brt2.s
to my site now: http://squoze.net/B/brt/ (I haven't actually assembled
them yet, there may be mistakes)

Some observations:

- The 'chain' format is actually a linked list and not a list of
addresses. Phil and I both got this wrong.

- The "Init" string is an error message if for some reason the B init
chain didn't run or main doesn't look like a function

- The cmdline arguments overwrite part of the init code. There's about
80 bytes of space for them before it overwrites the code that builds the
argv vector

- brt2.s is only to mark the beginning of the stack

I also saw some differences in the bilib code but haven't really
analyzed that part (yet?)

Would be really great if we could get all the files disassembled and
decompiled and restore the source code for everything :)

Best,
Angelo

On 08/06/23, segaloco via TUHS wrote:
> > The signature I would expect from binary B code of this era would be
> > that the generated code from each source file starts with a branch (or
> > jmp) around the contents of the file, to a "jsr r5, chain" followed by
> > a zero terminated list of addresses (which I guessed were addresses of
> > address words that needed to be fixed up).
>
> Looking a little closer I think that is what this is, because each file is an a.out header, then a jmp, followed by what I presume is B object code, then the destination of the jmp at that jsr r5 that passes into a routine that I think is then what handles that 0-terminated table of address words. All of the files have a similar bit up to the data word this opening process increments, so I suspect those are the bounds of brt1, from the opening vector (that the header of the B object jumps to) to the data flag that gets set by the inc operation.
>
> My assumption is that the B objects were stamped with a jmp that simply jumped to whatever the first address past the end was, so then brt1 had to be physically right there to accept flow. After that point the remaining bits in the B files aren't as similar, but what I can say is I don't see anything on the tail end of these binaries that is consistent enough between them to peg as a brt2. Instead each seems to be a slightly different jumble of interpreter routines themselves. At least I think, this is a very high level assessment though, I haven't fully broken any of these into individual parts yet.
>
> By the way, one characteristic of this supposed brt1 code is that it checks that the first word after the jump in the B object is 40022(8) (which is the in-core address of the next word in the B object btw). If it is not present, or the B runtime did not set the data flag indicated above as the end of brt1, then it simply prints "Init\n" on stdout and exits. Only if both this B "magic number" and the flag indicating proper entry are set does it seem to proceed rather than just printing Init and exiting.
>
> Not sure what this means, or what the reasoning behind this behavior is, but that explains the "Init" string in each binary, it is also part of the B runtime.
>
> - Matt G.

From aap at papnet.eu Thu Jun 15 06:03:19 2023
From: aap at papnet.eu (Angelo Papenhoff)
Date: Wed, 14 Jun 2023 22:03:19 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

After writing this mail I actually started reversing the B binaries.
You can find them here: http://squoze.net/B/programs/

I did find some differences in versions of the B runtime and library.
Especially interesting was an implementation of the cksto routine
in su and stty that checks whether an address in an assignment is in a
reasonable range ("LV out of range" error if not)

What is perhaps interesting historically is that the su binary contains
a hardcoded password ^Q^R^S^T, which is not printable for a good reason:
it is given as a command line argument.

I will hopefully continue with this in the near future (if, goto, mail and
glob are left).

Best,
aap

From tuhs at tuhs.org Thu Jun 15 07:53:21 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Wed, 14 Jun 2023 21:53:21 +0000
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Angelo, do you mind if I mirror these disassemblies into my https://gitlab.com/segaloco/v2src repository? That's where I'm (very slowly) accumulating the results of my own disassembly efforts on the V2 binaries. Bonus points if you raise a PR, but I can make sure you get a shout-out in the Readme or something otherwise. Thanks for digging deeper where I haven't found the time.

- Matt G.

From aap at papnet.eu Thu Jun 15 08:05:03 2023
From: aap at papnet.eu (Angelo Papenhoff)
Date: Thu, 15 Jun 2023 00:05:03 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Sure, go ahead. I hope I will have more soon.

From f4grx at f4grx.net Thu Jun 15 18:00:43 2023
From: f4grx at f4grx.net (Sebastien F4GRX)
Date: Thu, 15 Jun 2023 10:00:43 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Hello,

That is really interesting, thank you for providing that.

I see that su.b contains non-printable characters ($11,$12,$13,$14) which tripped my linux console when I attempted a copy paste, since $13 is XOFF :) Is there a way to escape these characters? Or update them, since it looks like a password provided on command line.

In its current state my compiler managed to eat all these sources except goto which segfaults for a reason I have not determined yet.

May I have your authorization for copying these files into my own b compiler repository (https://git.sr.ht/~f4grx/bpars)? What licence and attribution info shall I indicate?

Sebastien

PS: here is the 68hc11 assembly that the current version of my B compiler generates for echo.b. Code is still not functional, parameters and locals are not allocated, but instructions are consistent. This MCU is similar to the 6502 and to the CPUs in the Motorola 6800 line: two 8-bit accumulators A,B combinable into D, and two 16-bit index reg X and Y (Y is not used), 16-bit SP.

/*---------------------------------------------------------------------------*/
        .text
        .global main
        .func   main
main:
#function has no args
#local var i size 2 stack offset 0 TODO
#total stack frame size for compound: 2
        LDD     #1
        STD     i        /*Direct assign - local*/
.Lwhile_1:
/*binop, both complex*/
/*generate RHS in D, then move in temp*/
in gen_index: computing base expression (as lvalue)
in gen_index: lvalue=1
        LDX     #argv /*lvalue-extern*/
in gen_index: computing base expression done
        LDD     0,X
        PSHB
        PSHA
/*generate LHS in D*/
        LDX     #i /*lvalue-local*/
        LDD     0,X     /*Get value in X for easy 16-bit op*/
        XGDX    /*Now value is in X and address in D*/
        INX     /*Do the preinc*/
        XGDX    /*Now updated value is in D and address in X*/
        STD     0,X     /*Save the new value, next code can use it*/
/*execute*/
        PULX    /*--- Recall complex B computed before */
        STX     TEMP    /*---put in temp for binop*/
        CPD     TEMP
        BGT     .Lcond2
#start emit arglist for call
        .section .rodata
.strconst_3:
        .asciz "%s "
        .text
        LDD     #.strconst_3
        PSHB    /*argoff=0*/
        PSHA
in gen_index: computing base expression (as lvalue)
in gen_index: lvalue=1
        LDX     #argv /*lvalue-extern*/
in gen_index: computing base expression done
        LDD     i /*local*/
        LSLD
        STX     TEMP
        ADDD    TEMP
        XGDX
        LDD     0,X
        PSHB    /*argoff=2*/
        PSHA
#end emit arglist, size=4
        JSR     printf  /*undef*/
        PULX
        PULX
/*end of loop, eval condition again*/
        BRA     .Lwhile_1
.Lcond_2:
#start emit arglist for call
        LDD     #10
        PSHB    /*argoff=0*/
        PSHA
#end emit arglist, size=2
        JSR     putchar /*undef*/
        PULX
.Lstmtend_0: /*TODO avoid generating this if the statement does not contain breaks (no need for recursion)*/
.Lmain__rts: /* TODO avoid this label if statement contains no return */
        RTS
        .endfunc        # main

From aap at papnet.eu Thu Jun 15 18:21:09 2023
From: aap at papnet.eu (Angelo Papenhoff)
Date: Thu, 15 Jun 2023 10:21:09 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

On 15/06/23, Sebastien F4GRX wrote:
> I see that su.b contains non-printable characters ($11,$12,$13,$14) which
> tripped my linux console when I attempted a copy paste, since $13 is
> XOFF :) Is there a way to escape these characters? Or update them, since
> it looks like a password provided on command line.

The last1120 C compiler does not have an escape mechanism for strings,
so it's highly unlikely the B compiler had it.

> May I have your authorization for copying these files into my own b
> compiler repository ( https://git.sr.ht/~f4grx/bpars )? What licence and
> attribution info shall I indicate?

Sure, sure. I reversed goto.b today and am currently working on if.b. So
just keep an eye on the files.

best,
Angelo

From f4grx at f4grx.net Thu Jun 15 18:33:53 2023
From: f4grx at f4grx.net (Sebastien F4GRX)
Date: Thu, 15 Jun 2023 10:33:53 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID: <21361f2d-eda9-fd68-09f0-ded89b16285b@f4grx.net>

Hello again,

OK for the absence of escapes. It's not a big deal.

And thank you for the authorization. I will surely follow your progress.

There is a typo in your goto.b decompilation, an AND operator is missing before testing for end of string, here is a patch:

diff --git a/examples/squoze_goto.b b/examples/squoze_goto.b
index bb39e66..8d48bb3 100644
--- a/examples/squoze_goto.b
+++ b/examples/squoze_goto.b
@@ -31,7 +31,7 @@ l:
                 goto l;
                 }
         while ((ch=getchar())==' ');
-        while (ch!=' ' & ch!='*n' ch!='*0') {
+        while (ch!=' ' & ch!='*n' & ch!='*0') {
                 lchar(s, i++, ch);
                 ch = getchar();
         }

Sebastien

From douglas.mcilroy at dartmouth.edu Fri Jun 16 12:18:08 2023
From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy)
Date: Thu, 15 Jun 2023 22:18:08 -0400
Subject: [TUHS] GNU eqn clarifications and reforms
Message-ID:

I am not convinced that using special characters rather than in-line
eqn is a good thing. It means learning a whole new vocabulary. Quick,
what's the special character for Greek psi?

I have found that, for a sequence of displayed equations as in an
algebraic derivation, a pile often looks more coherent than a sequence
of EQ-EN pairs. The pile can even contain interleaved comments, as in
Hoare-style proofs.

Doug

From g.branden.robinson at gmail.com Fri Jun 16 13:20:55 2023
From: g.branden.robinson at gmail.com (G. Branden Robinson)
Date: Thu, 15 Jun 2023 22:20:55 -0500
Subject: [TUHS] GNU eqn clarifications and reforms
In-Reply-To:
References:
Message-ID: <20230616032055.ixdg6ubvfxgjhsmb@illithid>

[looping groff list back in]

At 2023-06-15T22:18:08-0400, Douglas McIlroy wrote:
> I am not convinced that using special characters rather than in-line
> eqn is a good thing. It means learning a whole new vocabulary. Quick,
> what's the special character for Greek psi?

\[*q] !

But I may suffer from an excessive familiarity with this material.

(Checking myself, I got it right! Part of my mnemonic is that there are 24 letters to map from our Latin alphabet to Greek, drop 'j' and 'v' as "Late Latin" variants of 'i' and 'u'[1], and then most of the rest map intuitively with a handful of exceptions that have to be memorized, psi being one of them.)

> I have found that, for a sequence of displayed equations as in an
> algebraic derivation, a pile often looks more coherent than a sequence
> of EQ-EN pairs. The pile can even contain interleaved comments, as in
> Hoare-style proofs.

Yes. I suspect there is a widely held misconception that eqn distinguishes displayed equations from inline ones.
It doesn't--a macro package might, but even then, nothing internal to the equation's typography is different. I guess this is a hangover from TeX? You need one rule: use "smallover" instead of "over" if you're trying to pack a fraction into running text. Regards, Branden [1] Which isn't _quite_ correct but works for this purpose. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From g.branden.robinson at gmail.com Fri Jun 16 15:07:05 2023 From: g.branden.robinson at gmail.com (G. Branden Robinson) Date: Fri, 16 Jun 2023 00:07:05 -0500 Subject: [TUHS] GNU eqn clarifications and reforms In-Reply-To: <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> Message-ID: <20230616050705.3ualzkwc44a7sm4u@illithid> At 2023-06-16T14:22:22+1000, Damian McGuckin wrote: > On Thu, 15 Jun 2023, G. Branden Robinson wrote: > > But I may suffer from an excessive familiarity with this material. > > Yes. Me too. Maybe that is a sad comment on the both of us. That is the price of trying to leave things better than one found them. > Why do Greeks have an alternate way of writing sigma! We English-speakers used to have an alternative way of writing it, if you regard the Latin alphabet's "S" as cognate (so to speak) with the Greek sigma (and I think doing so is defensible). It's even in Unicode with a low code point, U+017F. For inſtance, the United States uſed to employ a non-final lowercaſe S in the founding documents of its preſent government, where you can see exhibits of the "Congreſs of the United States". It can take the modern reader a "long S" time to not read that "s" as an "f". And if you think that's difficult enough, check out, IIRC, the Arabic and Devanagari scripts where you can have different initial, medial, and final forms for letters. 
Follow-ups ſhould probably be confined to groff@; I'll ſtop now leſt we get ſent to the COFF liſt for groſs tranſgreſſions of topicality. (Although if anyone wants to tell me whether non-final s was applied to the trailing ends of non-final morphemes _within_ words, I'm all ears.) Regards, Branden -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From wobblygong at gmail.com Fri Jun 16 19:39:27 2023 From: wobblygong at gmail.com (Wesley Parish) Date: Fri, 16 Jun 2023 21:39:27 +1200 Subject: [TUHS] end-S/long-S (was: Re: GNU eqn clarifications and reforms) In-Reply-To: <1qA46y-3OI-00@marmaro.de> References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> <1qA46y-3OI-00@marmaro.de> Message-ID: I ſaw Mommy kiſſing ſanta Klaus, underneath the miſtletoe laſt night ... re: compound words in English ending is "s" - bossman, bossmonkey, etc. Though in the case of "Godzone", a somewhat tongue-in-cheek name for New Zealand derived from "God's Own", a tribute to New Zealand's natural wonders, the "s" has been replaced by "z" ... Yes, it's an interesting feature of European alphabet development. Wesley Parish On 16/06/23 19:43, markus schnalke wrote: > Hoi. > > [2023-06-16 07:07] "G. Branden Robinson" >> For inſtance, the United States uſed to employ a non-final lowercaſe S >> in the founding documents of its preſent government, where you can see >> exhibits of the "Congreſs of the United States". > In old German, up to WWII, namely in Fraktur (the printed letters) > and Sütterlin (the handwritten letters) both kinds of S are > present. > > Today, the long-S has only survived in some old company and > restaurant names, many of them changing by and by to the end-S, > because younger Germans can't read long-S and don't understand it > anymore. 
Newer names don't use it the long-S, even if they are > written in Fraktur letters, which would demand for the long-S. > > For example the beer brand Warsteiner changed the long-S in 2013 to > the end-S. > https://1000logos.net/wp-content/uploads/2020/01/Warsteiner-Logo-history.png > > >> (Although if anyone wants to tell me whether non-final s was applied to >> the trailing ends of non-final morphemes _within_ words, I'm all ears.) > I'm no language expert, so I don't really know what morphemes are. > What I do know is that the round-S (i.e. end-S) is applied to the > end of words, parts of compound words (typical for German), and in > some situatuations even to parts of words. -- But only in Fraktur > and Sütterlin, not in modern German (latin alphabet), which does > no longer have a long-S. > > Examples: > > End of word: Haus (engl: house) > Middle of word: Kiſte (engl: box) > > Compound word: Hausmaus (engl: house mouse) > Hauſmaus would be wrong. > > In such cases the end-S is in the middle of the word. Such > compounds are typical for German. If you have an english word > like ``downhill'', where two separate words joined into one, > the end-S of the first part would still remain an end-S, > although it moved into the middle of the word. (Sorry, I > cannot find an english example where the first word part ends > with s.) > > There's a famous example for the difference the distinguishing > between s and ſ can make: > > Wachſtube, i.e. Wach-Stube (engl: guardhouse) > Wachstube, i.e. Wachs-Tube (engl: wax tube) > > In modern German context is necessary to know which meaning > of Wachstube is the right one, in old German it's clear from the > writing. > > Besides compounds German is also infamous for it's prefixes. 
If > you combine the prefix ``aus'' (engl: out) with other words, the > end-S remains as well: > > ausgezeichnet (engl: excellent -- wordly: out-marked) > Ausfahrt (engl: exit for vehicles -- wordly: out-drive) > > Using the long-S in these situation would be wrong. > > That means: Whenever one uses a word, that can stand alone (and is > thus well-known for it's shape), as part of a larger word, the part > stays the same, even within other words, keeping its end-S. > > > Generally I'd say, but take this only as a rule of thumb, because > I'm not enough expert in this: You use an end-S in all situation > where you would want to avoid a ligature of the s with the next > letter. Long-S can have ligatures with the following letter and > there are common ones in German. (In Wachſtube an st-ligature > would be preferred.) End-S will never have ligatures with the > following letter. > > > This at least is the situation concerning old German, as > understood by someone with curiousity for the topic but without > real lingual knowledge. > > > meillo From meillo at marmaro.de Fri Jun 16 17:43:12 2023 From: meillo at marmaro.de (markus schnalke) Date: Fri, 16 Jun 2023 09:43:12 +0200 Subject: [TUHS] end-S/long-S (was: Re: GNU eqn clarifications and reforms) In-Reply-To: <20230616050705.3ualzkwc44a7sm4u@illithid> References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> Message-ID: <1qA46y-3OI-00@marmaro.de> Hoi. [2023-06-16 07:07] "G. Branden Robinson" > > For inſtance, the United States uſed to employ a non-final lowercaſe S > in the founding documents of its preſent government, where you can see > exhibits of the "Congreſs of the United States". In old German, up to WWII, namely in Fraktur (the printed letters) and Sütterlin (the handwritten letters) both kinds of S are present. 
Today, the long-S has only survived in some old company and restaurant names, many of them changing by and by to the end-S, because younger Germans can't read long-S and don't understand it anymore. Newer names don't use it the long-S, even if they are written in Fraktur letters, which would demand for the long-S. For example the beer brand Warsteiner changed the long-S in 2013 to the end-S. https://1000logos.net/wp-content/uploads/2020/01/Warsteiner-Logo-history.png > (Although if anyone wants to tell me whether non-final s was applied to > the trailing ends of non-final morphemes _within_ words, I'm all ears.) I'm no language expert, so I don't really know what morphemes are. What I do know is that the round-S (i.e. end-S) is applied to the end of words, parts of compound words (typical for German), and in some situatuations even to parts of words. -- But only in Fraktur and Sütterlin, not in modern German (latin alphabet), which does no longer have a long-S. Examples: End of word: Haus (engl: house) Middle of word: Kiſte (engl: box) Compound word: Hausmaus (engl: house mouse) Hauſmaus would be wrong. In such cases the end-S is in the middle of the word. Such compounds are typical for German. If you have an english word like ``downhill'', where two separate words joined into one, the end-S of the first part would still remain an end-S, although it moved into the middle of the word. (Sorry, I cannot find an english example where the first word part ends with s.) There's a famous example for the difference the distinguishing between s and ſ can make: Wachſtube, i.e. Wach-Stube (engl: guardhouse) Wachstube, i.e. Wachs-Tube (engl: wax tube) In modern German context is necessary to know which meaning of Wachstube is the right one, in old German it's clear from the writing. Besides compounds German is also infamous for it's prefixes. 
If you combine the prefix ``aus'' (engl: out) with other words, the end-S remains as well: ausgezeichnet (engl: excellent -- wordly: out-marked) Ausfahrt (engl: exit for vehicles -- wordly: out-drive) Using the long-S in these situation would be wrong. That means: Whenever one uses a word, that can stand alone (and is thus well-known for it's shape), as part of a larger word, the part stays the same, even within other words, keeping its end-S. Generally I'd say, but take this only as a rule of thumb, because I'm not enough expert in this: You use an end-S in all situation where you would want to avoid a ligature of the s with the next letter. Long-S can have ligatures with the following letter and there are common ones in German. (In Wachſtube an st-ligature would be preferred.) End-S will never have ligatures with the following letter. This at least is the situation concerning old German, as understood by someone with curiousity for the topic but without real lingual knowledge. meillo From paul.winalski at gmail.com Sat Jun 17 02:18:29 2023 From: paul.winalski at gmail.com (Paul Winalski) Date: Fri, 16 Jun 2023 12:18:29 -0400 Subject: [TUHS] end-S/long-S (was: Re: GNU eqn clarifications and reforms) In-Reply-To: <1qA46y-3OI-00@marmaro.de> References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> <1qA46y-3OI-00@marmaro.de> Message-ID: On 6/16/23, markus schnalke wrote: > > [2023-06-16 07:07] "G. Branden Robinson" >> >> For inſtance, the United States uſed to employ a non-final lowercaſe S >> in the founding documents of its preſent government, where you can see >> exhibits of the "Congreſs of the United States". > > In old German, up to WWII, namely in Fraktur (the printed letters) > and Sütterlin (the handwritten letters) both kinds of S are > present. 
> > Today, the long-S has only survived in some old company and > restaurant names, many of them changing by and by to the end-S, > because younger Germans can't read long-S and don't understand it > anymore. German also has a ligature letter called eszet that is a fusion of a long s (the one that resembles the English letter f) and a short s. It is used when a 's' sound is immediately preceded by a long vowel or a diphthong and not followed by a consonant. When the glyph for eszet isn't available 'ss' is substituted, as in the word 'strasse' (street). -Paul W. From cowan at ccil.org Sat Jun 17 03:46:16 2023 From: cowan at ccil.org (John Cowan) Date: Fri, 16 Jun 2023 13:46:16 -0400 Subject: [TUHS] end-S/long-S (was: Re: GNU eqn clarifications and reforms) In-Reply-To: References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> <1qA46y-3OI-00@marmaro.de> Message-ID: On Fri, Jun 16, 2023 at 12:18 PM Paul Winalski wrote: > German also has a ligature letter called eszet that is a fusion of a > long s (the one that resembles the English letter f) and a short s. > Not a short s, but a z, as the name indicates: es-zett, S-Z. This reflects the use of z in Old and Middle High German to represent a sibilant sound distinct from s, derived from /t/ by the High German sound shift but distinct from original /s/. When the distinction was lost in the 13C, z came to be used for its modern sound /ts/, but the ligature came to represent the merged /s/. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From usotsuki at buric.co Sat Jun 17 03:51:18 2023 From: usotsuki at buric.co (Steve Nickolas) Date: Fri, 16 Jun 2023 13:51:18 -0400 (EDT) Subject: [TUHS] end-S/long-S (was: Re: GNU eqn clarifications and reforms) In-Reply-To: References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> <1qA46y-3OI-00@marmaro.de> Message-ID: On Fri, 16 Jun 2023, John Cowan wrote: > On Fri, Jun 16, 2023 at 12:18 PM Paul Winalski > wrote: > > >> German also has a ligature letter called eszet that is a fusion of a >> long s (the one that resembles the English letter f) and a short s. >> > > Not a short s, but a z, as the name indicates: es-zett, S-Z. This > reflects the use of z in Old and Middle High German to represent a sibilant > sound distinct from s, derived from /t/ by the High German sound shift but > distinct from original /s/. When the distinction was lost in the 13C, z > came to be used for its modern sound /ts/, but the ligature came to > represent the merged /s/. I've seen ß used in some copies of the Geneva Bible with exactly the modern German sense, as a ligature of long s and normal s. -uso. From aap at papnet.eu Sat Jun 17 04:19:58 2023 From: aap at papnet.eu (Angelo Papenhoff) Date: Fri, 16 Jun 2023 20:19:58 +0200 Subject: [TUHS] end-S/long-S (was: Re: GNU eqn clarifications and reforms) In-Reply-To: References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> <1qA46y-3OI-00@marmaro.de> Message-ID: On 16/06/23, Steve Nickolas wrote: > I've seen ß used in some copies of the Geneva Bible with exactly the > modern German sense, as a ligature of long s and normal s. There are two origins of this character. One ſs and one ſʒ. 
You can see it here:
https://upload.wikimedia.org/wikipedia/commons/0/0e/Sz_modern.svg
1 and 2 are ſs, 3 and 4 are ſʒ

Now it gets even more off-topic (sorry):
My personal favourite (and the one I've adopted in my handwriting) is the one used in Berlin street signs, obviously ſʒ:
https://upload.wikimedia.org/wikipedia/commons/3/32/Strasse-FF-Cst-Berlin.png
Interestingly I've noticed the Bonn street signs look rather similar, maybe it's a thing capitals do.

aap

From g.branden.robinson at gmail.com Sat Jun 17 05:38:57 2023
From: g.branden.robinson at gmail.com (G. Branden Robinson)
Date: Fri, 16 Jun 2023 14:38:57 -0500
Subject: [TUHS] GNU eqn clarifications and reforms
In-Reply-To: <1da83cf3-7fac-b5fd-dec-e4b4313d3a@esi.com.au>
References: <20230616032055.ixdg6ubvfxgjhsmb@illithid> <37ea7f2b-e8f7-3cfe-a27d-ece47c5dc0f7@esi.com.au> <20230616050705.3ualzkwc44a7sm4u@illithid> <1qA46y-3OI-00@marmaro.de> <1da83cf3-7fac-b5fd-dec-e4b4313d3a@esi.com.au>
Message-ID: <20230616193857.tvtjhddovjpwfegb@illithid>

At 2023-06-17T05:19:46+1000, Damian McGuckin wrote:
> Getting back to groff, that final/terminating sigma, is it still
> pronounced as sigma?
>
> It certainly has no EQN equivalent name and its groff short symbol
> name is
>
> \(ts
>
> (terminal sigma) which is not like other greek letters. Just
> wondering whether it needs a sentence to mention its absence from
> EQN.

There are a few others, but they postdate Ossanna troff.

From groff_char(7) in 1.23.0.rc4:

    ϵ   \[+e]   u03F5   variant epsilon (lunate)
    ϑ   \[+h]   u03D1   variant theta (cursive form)
    ϖ   \[+p]   u03D6   variant pi (similar to omega)
    φ   \[+f]   u03C6   variant phi (curly shape)
    ς   \[ts]   u03C2   terminal lowercase sigma +

I know of no reason to make these generally available by default in eqn, though, any more than they already are. You can type their special characters in eqn input and assign spacing and style types to them. (This typing system is a GNU eqn feature, not present in AT&T eqn).
In fact I have a coupled pair of reforms in mind for GNU eqn: unfastening the definitions of the lowercase Greek special characters from the typeface used for letters (variables), and then defining the lowercase Greek letter eqn macro names ("alpha", "beta", ...) to explicitly use the "letter" style type.

https://savannah.gnu.org/bugs/?64232
https://savannah.gnu.org/bugs/?64231

(For example:

define alpha ! type "letter" \(*a !

)

This should result in no change for historical documents (except on terminals, where it will fix a bug), and give us some flexibility for users of modern fonts where Greek letters are properly supported in text fonts (i.e., in four styles).

The Graphic Systems C/A/T had uppercase Greek available _only_ upright and lowercase Greek _only_ italic. Modern typesetting systems are not so limited.

Regards,
Branden

From tuhs at tuhs.org Sat Jun 17 05:52:50 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Fri, 16 Jun 2023 19:52:50 +0000
Subject: [TUHS] WECo UNIX 3B5 User's Manual Found
Message-ID:

Good day, I've just received in the mail a UNIX System User Reference Manual for the 3B5 computer. It has a few differences with other documentation around it. As usual with an initial expository message, lots of info here, mostly so it'll get in the archive and on the record.

So first off, I can't find a version reference in this thing. It is branded as "UNIX System" consistent with the branding nomenclature in the System V era, but I can't actually find the term "System V" anywhere thus far. However, a high level view implies relative parity with the initial release of System V.
There are some areas where nomenclature is closer in character to the Release 5.0 manual, for instance, the "basinf" section in the intro refers to the user guide as "UNIX User Guide" rather than "UNIX System User's Guide" which is found in later System V stuff. In fact, this minor reference may point to some branching in the documentation between 4.x and SVR2 as I have the following in various manuals (prior to 4.1 i.e. 3.0/SysIII the reference is still directly to UNIX For Beginners, not a guide):

- Release 4.1 - "UNIX User's Guide"
- Release 5.0 - "UNIX User's Guide"
- Release 5.0 BTL - "UNIX User's Guide"
- System V - "UNIX System User's Guide"
- System V 3B5 - "UNIX User Guide"
- System V R2 BTL - "UNIX System User Guide"
- System V DEC (1984) - "UNIX System User Guide"
- System V R2 (1986 Manuals) - "UNIX System User's Guide"

So SysV adds "System" to what was in Release 5.0, this carries through to conventional SVR2. The 3B5 version, however, drops the apostrophe and 's' that were in the pre-SysV nomenclature but doesn't add "System". Then even more confusing the SVR2 BTL copy appears to bear some lineage from this as it also has the dropped apostrophe and 's' but includes "System". Strange. Even stranger is I decided to take a peek at SVR2 docs from 1984, they lack the apostrophe and 's', but later SVR2 material from 1986 restores it.

I wonder if this implies the 3B5 branch was started in the 5.0 days, diverged a bit, and then was only partially recombined with System V before release, although on the flip side, this manual *does* include the edit, ex, and vi manpages which were not printed in time for the System V manual run (as they are included with a separate documentation package instead.) This tracks with the BTL 5.0 having edit, ex, vi, and termcap present from Holmdel, the BTL manual got the pages early.
All that to say, there are things in this manual that aren't yet in published System V manuals at the time, but there are things in this manual that have since been altered by the time of the formal System V documentation, pointing to an earlier branch point and then ongoing cross-talk after that.

Included are references to a "3B Computer Network" and a few utilities associated. There are a few other pages too I didn't see in other contemporary public manuals, in total:

- dcon(1) - Spawns a shell on a remote system via a DATAKIT circuit
- logdir(1) - Returns the home directory field from /etc/passwd, this is in the BTL versions, I don't see it in public SysV though
- ncp(1) - Copies files over the DATAKIT network
- nisend(1) - Copies files over the "3B Computer Local Network"
- nistat(1) - Query the status of said network
- nitable(1) - Display the configuration table of said network
- niupdate(1) - Update said configuration table
- nkill(1) - Kill but using process names instead of IDs, but doesn't define process names, be it argv[0], the name of the image file, etc...
- rexec(1) - Executes commands over a DATAKIT network
- rl(1) - Login remotely over the 3B Computer Network (distinct from dcon being DATAKIT remote logins) This appears to be uucp-derived (specifically cu(1))
- dkdial(3) - Dials a DATAKIT connection
- boothdr(4) - 3B5 only, provides the contents of which supports storing parts of master(4), via mkboot(1M), in "a driver object file" to be used with "the self-config boot".

Section 6 is mentioned in the intro but then omitted from the rest of the manual, so nothing to compare there. Also keeping with the documentation changes at the time, this does not include Sections 1M, 7, nor 8, as those are presumably in an accompanying Administrator's Reference Manual.
That is another thing pegging this as System V rather than SVR2, by SVR2 they had further divided from two to three manuals, splitting the user manual into Sections 1 and 6 (User) and Sections 2, 3, 4, and 5 (Programmer) (although even this isn't entirely true, I've got a "UNIX User's Manual" published in 1986, red ATTIS-style cover, that contains what appear to be selections from Sections 1, 2, and 3...it seems more geared towards folks writing portable software between SVR1 and SVR2 than anything)

Finally, here are the omissions I compared with the SVR2 BTL, SVR2 DEC, and 1986 manual mentioned above:

Removed by SVR2 public, only in the BTL version:
- nscstat(1)
- nsctorje(1)
- nusend(1)
- stlogin(1)
- ststat(1)

Non-portable DEC stuff:
- adb(1)
- arcv(1)
- kasb(1)
- net(1)
- vpr(1)
- maus(2)
- x25alnk(3) - This X.25 stuff never shows back up, probably dropped as of SVR2 BTL (1983)
- x25clnk(3)
- x25hlnk(3)
- x25ipvc(3)

Non-portable 3B20S stuff:
- cprs(1)
- hpio(1)

Honeywell/GCOS Interop, gone by SVR2 BTL (1983):
- dpd(1)
- dpr(1)
- fget(1)
- fsend(1)
- gcat(1)
- gcosmail(1)

Graphics Subsystem, remains in SVR2 so probably not 3B5 supported as of this printing:
- gdev(1)
- ged(1)
- graphics(1)
- gutil(1)
- stat(1)
- toc(1)

So just to review, some matters this manual supports:

- The initial 3B5 UNIX release seems closest in character to the initial System V version
- Many DEC and 3B20-specific components are omitted
- The Honeywell/GCOS interop was on the way out the door and likely never ported
- The graphics subsystem was not supported on 3B5 as of this release
- Synchronous terminals and NSC networking are taken internal likely by this release, certainly by SVR2
- The 3B5 version supported "DATAKIT" and "3B Computer Network" networks
- Included a logdir(1) command used in BTL for getting a user's login directory from /etc/passwd
- Included an nkill(1) command to kill a process by its (undefined) name
- The boot process included a header object for "driver object files" used with a "self-config boot" process

If there are any questions or any pages folks think I should peruse for details, just let me know. Otherwise this won't be hitting my detailed analysis for a while, I'm currently in the midst of figuring out a branching scheme in my mandiff repo that'll facilitate tracking the various forks, as I've found many changes between V5 and V6 that are *not* reflected in various ways throughout PWB, Program Generic, CB, and 32V (as an example, go look up where lpr(I) is and isn't available.)

- Matt G.

P.S. Kudos to the production quality of this manual. It's a small binder, the pages are the same size as the earlier comb-bound manuals. The binder rings themselves are fixed to the back cover and the right side of the rings is flat instead of rounded, so the pages sit very nicely whether opened or closed. This compares with the BTL SVR2 binder where the rings are perfectly round and affixed to the spine instead, so they sit differently depending on whether the binder is on a shelf or open on a desk, with pages at risk of getting crumpled or bunched up at the edge of the rings. Certainly has nothing to do with software or technical history, but the physical nature of the various publications has also been factoring into my study. Here's a picture of the two covers by the way, since I haven't given any visuals on my work in a while: https://i.imgur.com/hhaaxfA.jpeg

From aap at papnet.eu Sat Jun 17 18:19:29 2023
From: aap at papnet.eu (Angelo Papenhoff)
Date: Sat, 17 Jun 2023 10:19:29 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Update: I'm now done with the first pass of this. I reversed all the programs and successfully ran them through my compiler (I haven't assembled or linked anything though).
http://squoze.net/B/programs/

To check for correctness, the files should of course be compiled, assembled and linked again. Unfortunately my compiler currently does not generate quite the same code as the original one. I will have to work on this. Most importantly & and | are only bitwise operators in the version of B that compiled these programs, but some other differences (like the fixup chain and the way strings are stored) exist too.

It would be nice to have a fully working B system on v1/v2 UNIX again, with everything built from source, we can even reconstruct different versions of the runtime (and perhaps standard library). So far the PDP-11 version of my B system has only run on v6 and 2.11BSD.

best,
aap

On 14/06/23, Angelo Papenhoff wrote:
> I will hopefully continue with this in the next time (if, goto, mail and
> glob are left).

From douglas.mcilroy at dartmouth.edu Sun Jun 18 01:53:00 2023
From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy)
Date: Sat, 17 Jun 2023 11:53:00 -0400
Subject: [TUHS] undiagnosed pic error
In-Reply-To:
References:
Message-ID:

It's fairly sparse, e.g. "at" is compatible with most everything except "from" and "to". Setup might look like this:

long int compat[] = {
    [HAS_AT] = HAS_FROM | HAS_TO,

On Wed, Jun 14, 2023 at 6:42 AM Marc Donner wrote:
>
> How sparse is the 35x35 matrix? For comprehensibility would it be the best way to do it?
>
> On Tue, Jun 13, 2023 at 9:59 PM Douglas McIlroy wrote:
>>
>> There may be a simple generic way to correct pic's habit of accepting
>> any set of object modifiers on any object, but obeying only a
>> compatible subset.
>>
>> Pic already collects a bit vector of modifier types attached to the
>> current object. If that were extended with a few more bits that
>> designate the object types, the size, B, of the bit vector would be
>> about 35--an easy fit in one 64-bit word. Then a BxB bit matrix could
>> record both modifier/modifier incompatibilities and object/modifier
>> incompatibilities.
The collected bit vector needs to be tested against >> the matrix once per object definition. >> >> It seems to be harder to catch duplication of modifiers, requiring >> extra code at all points where bits are set. Nevertheless, this kind >> of error also merits detection. >> >> Some questions >> >> Does anybody think the issue is not worth addressing? >> >> Is there a better scheme than that suggested above? >> >> Is the scheme adequate? It would not, for example, catch a three-way >> incompatibility that does not entail any pairwise incompatibility, >> should such an incompatibility exist. >> >> Any other thoughts? >> >> Doug > > -- > ===== > nygeek.net > mindthegapdialogs.com/home From douglas.mcilroy at dartmouth.edu Sun Jun 18 01:59:37 2023 From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy) Date: Sat, 17 Jun 2023 11:59:37 -0400 Subject: [TUHS] undiagnosed pic error In-Reply-To: References: Message-ID: Google claims I just sent another unintended reply, this time unfinished. Apologies, Doug On Wed, Jun 14, 2023 at 6:42 AM Marc Donner wrote: > > How sparse is the 35x35 matrix? For comprehensibility would it be the best way to do it? > > On Tue, Jun 13, 2023 at 9:59 PM Douglas McIlroy wrote: >> >> There may be a simple generic way to correct pic's habit of accepting >> any set of object modifiers on any object, but obeying only a >> compatible subset. >> >> Pic already collects a bit vector of modifier types attached to the >> current object. If that were extended with a few more bits that >> designate the object types, the size, B, of the bit vector would be >> about 35--an easy fit in one 64-bit word. Then a BxB bit matrix could >> record both modifier/modifier incompatibilities and object/modifier >> incompatibilities. The collected bit vector needs to be tested against >> the matrix once per object definition. >> >> It seems to be harder to catch duplication of modifiers, requiring >> extra code at all points where bits are set. 
Nevertheless, this kind >> of error also merits detection. >> >> Some questions >> >> Does anybody think the issue is not worth addressing? >> >> Is there a better scheme than that suggested above? >> >> Is the scheme adequate? It would not, for example, catch a three-way >> incompatibility that does not entail any pairwise incompatibility, >> should such an incompatibility exist. >> >> Any other thoughts? >> >> Doug > > -- > ===== > nygeek.net > mindthegapdialogs.com/home From tuhs at tuhs.org Sun Jun 18 05:37:55 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Sat, 17 Jun 2023 19:37:55 +0000 Subject: [TUHS] 4.3BSD User Contributed Software Volume? Message-ID: Hello, I've come across something in a bookshelf up at the local university that I have thus far been unsuccessful at locating online. It's a binder amongst other 4.3BSD binders like the reference manuals and supplementary documents but this one is labeled "User Contributed Software (UCS)". I can't find any of these in the doc folder of the 4.3BSD copies in the archive nor can I find a scan of the originals. There is an overview at the start listing the following packages: B, X, ansi (VAX tape tools), apl, bib, courier, cpm, dipress, dsh, emacs, enet, help, hyper, icon, jove, kermit, mh, mkmf, mmdf, news, notes, npl00, patch, path alias, rcs, rn, spms, sumacc, sunrpc, tac, tools, umodem, and xns. There are a preponderance of man pages as well as some focused papers for B, SPMS, the Icon programming language, and MMDFII. I didn't scribble down more notes before heading home because I figured this was something I'd find on page one of an internet search, but thus far have had no luck. If I don't turn up a digital copy soon I might just have to make another trip up there soon with my flatbed scanner. 
In any case I left a note too hoping whoever the curator of that bookshelf is (it was in some club room) might have the scoop on those binders, if they're left over from some long gone 4.3BSD installation in the EE/CS department and if they might have some siblings in other bookshelves nearby. - Matt G. From reed at reedmedia.net Sun Jun 18 06:23:37 2023 From: reed at reedmedia.net (Jeremy C. Reed) Date: Sat, 17 Jun 2023 20:23:37 +0000 (UTC) Subject: [TUHS] 4.3BSD User Contributed Software Volume? In-Reply-To: References: Message-ID: On Sat, 17 Jun 2023, segaloco via TUHS wrote: > It's a binder amongst other 4.3BSD binders like the reference manuals > and supplementary documents but this one is labeled "User Contributed > Software (UCS)". I can't find any of these in the doc folder of the > 4.3BSD copies in the archive nor can I find a scan of the originals. > There is an overview at the start listing the following packages: B, > X, ansi (VAX tape tools), apl, bib, courier, cpm, dipress, dsh, emacs, > enet, help, hyper, icon, jove, kermit, mh, mkmf, mmdf, news, notes, > npl00, patch, path alias, rcs, rn, spms, sumacc, sunrpc, tac, tools, > umodem, and xns. Is this the same as the docs found in 4.3bsd /new 4.3BSD-Tahoe /new 4.3BSD-Reno /src/share/doc/ucs csrg-archives/disk1/mnt/4.3/usr/contrib csrg-archives/disk2/mnt/4.3reno/usr/src/share/doc/ucs/ csrg-archives/disk2/mnt/4.3tahoe/usr/src/new/ I found docs for kermit, umodem, sumacc, and others. From reed at reedmedia.net Sun Jun 18 06:37:39 2023 From: reed at reedmedia.net (Jeremy C. Reed) Date: Sat, 17 Jun 2023 20:37:39 +0000 (UTC) Subject: [TUHS] 4.3BSD User Contributed Software Volume? In-Reply-To: References: Message-ID: On Sat, 17 Jun 2023, Jeremy C. Reed wrote: > On Sat, 17 Jun 2023, segaloco via TUHS wrote: > > It's a binder amongst other 4.3BSD binders like the reference manuals > > and supplementary documents but this one is labeled "User Contributed > > Software (UCS)". 
I can't find any of these in the doc folder of the
> > 4.3BSD copies in the archive nor can I find a scan of the originals.
> > There is an overview at the start listing the following packages: B,
> > X, ansi (VAX tape tools), apl, bib, courier, cpm, dipress, dsh, emacs,
> > enet, help, hyper, icon, jove, kermit, mh, mkmf, mmdf, news, notes,
> > npl00, patch, path alias, rcs, rn, spms, sumacc, sunrpc, tac, tools,
> > umodem, and xns.

Also see

4.3BSD-Reno src/share/doc/ucs/Cover
csrg-archives/disk2 4.3reno/usr/src/share/doc/ucs/Cover

.TL
User Contributed Software
.LP
The subtree /usr/src/new contains programs contributed by the user
community. The following software is included:
.DS
.TS
center, box;
l l l.
Directory	Description	Contributor(s)
_
...

The Makefile above it says:

# index, iso, and ucs aren't done yet

From tuhs at tuhs.org Sun Jun 18 06:48:09 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Sat, 17 Jun 2023 20:48:09 +0000
Subject: [TUHS] 4.3BSD User Contributed Software Volume?
In-Reply-To:
References:
Message-ID: <4BE6zOmhi8qtP3BlKusXuLdZ97CR8m8LYWuGG3__BZEVYo1Nx28UJ1PlIT0u967pT5-cTGJF_vqmiK6uIvf0F6re-MEhuu08Hyv5PqaApCg=@protonmail.com>

Gah, that's the one. I was just looking in /usr/doc and /usr/share/doc,
didn't realize there were also copies down in the source tree with
different contents. Okay cool so this is accounted for, good, thanks for
pointing that out Jeremy! I just need to start grepping instead of doing
a shallow search in the interface, I have all these sources in a local
stash anyway, I need to put that cache to better use.

- Matt G.

------- Original Message -------
On Saturday, June 17th, 2023 at 1:37 PM, Jeremy C. Reed wrote:

> On Sat, 17 Jun 2023, Jeremy C. Reed wrote:
>
> > On Sat, 17 Jun 2023, segaloco via TUHS wrote:
> >
> > > It's a binder amongst other 4.3BSD binders like the reference manuals
> > > and supplementary documents but this one is labeled "User Contributed
> > > Software (UCS)".
I can't find any of these in the doc folder of the > > > 4.3BSD copies in the archive nor can I find a scan of the originals. > > > There is an overview at the start listing the following packages: B, > > > X, ansi (VAX tape tools), apl, bib, courier, cpm, dipress, dsh, emacs, > > > enet, help, hyper, icon, jove, kermit, mh, mkmf, mmdf, news, notes, > > > npl00, patch, path alias, rcs, rn, spms, sumacc, sunrpc, tac, tools, > > > umodem, and xns. > > > Also see > 4.3BSD-Reno src/share/doc/ucs/Cover > > csrg-archives/disk2 4.3reno/usr/src/share/doc/ucs/Cover > > .TL > User Contributed Software > .LP > The subtree /usr/src/new contains programs contributed by the user > community. The following software is included: > .DS > .TS > center, box; > l l l. > Directory Description Contributor(s) > _ > ... > > > The Makefile above it says: # index, iso, and ucs aren't done yet From kennethgoodwin56 at gmail.com Sun Jun 18 10:11:09 2023 From: kennethgoodwin56 at gmail.com (Kenneth Goodwin) Date: Sat, 17 Jun 2023 20:11:09 -0400 Subject: [TUHS] undiagnosed pic error In-Reply-To: References: Message-ID: You did. You forgot the trailing } Syntax error See previous email... On Sat, Jun 17, 2023, 12:00 PM Douglas McIlroy < douglas.mcilroy at dartmouth.edu> wrote: > Google claims I just sent another unintended reply, this time unfinished. > > Apologies, > Doug > > On Wed, Jun 14, 2023 at 6:42 AM Marc Donner wrote: > > > > How sparse is the 35x35 matrix? For comprehensibility would it be the > best way to do it? > > > > On Tue, Jun 13, 2023 at 9:59 PM Douglas McIlroy < > douglas.mcilroy at dartmouth.edu> wrote: > >> > >> There may be a simple generic way to correct pic's habit of accepting > >> any set of object modifiers on any object, but obeying only a > >> compatible subset. > >> > >> Pic already collects a bit vector of modifier types attached to the > >> current object. 
If that were extended with a few more bits that > >> designate the object types, the size, B, of the bit vector would be > >> about 35--an easy fit in one 64-bit word. Then a BxB bit matrix could > >> record both modifier/modifier incompatibilities and object/modifier > >> incompatibilities. The collected bit vector needs to be tested against > >> the matrix once per object definition. > >> > >> It seems to be harder to catch duplication of modifiers, requiring > >> extra code at all points where bits are set. Nevertheless, this kind > >> of error also merits detection. > >> > >> Some questions > >> > >> Does anybody think the issue is not worth addressing? > >> > >> Is there a better scheme than that suggested above? > >> > >> Is the scheme adequate? It would not, for example, catch a three-way > >> incompatibility that does not entail any pairwise incompatibility, > >> should such an incompatibility exist. > >> > >> Any other thoughts? > >> > >> Doug > > > > -- > > ===== > > nygeek.net > > mindthegapdialogs.com/home > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsg at jsg.id.au Sun Jun 18 11:07:25 2023 From: jsg at jsg.id.au (Jonathan Gray) Date: Sun, 18 Jun 2023 11:07:25 +1000 Subject: [TUHS] 4.3BSD User Contributed Software Volume? In-Reply-To: References: Message-ID: On Sat, Jun 17, 2023 at 07:37:55PM +0000, segaloco via TUHS wrote: > Hello, I've come across something in a bookshelf up at the local > university that I have thus far been unsuccessful at locating online. > > It's a binder amongst other 4.3BSD binders like the reference manuals > and supplementary documents but this one is labeled "User Contributed > Software (UCS)". I can't find any of these in the doc folder of > the 4.3BSD copies in the archive nor can I find a scan of the > originals. 
There is an overview at the start listing the following > packages: B, X, ansi (VAX tape tools), apl, bib, courier, cpm, > dipress, dsh, emacs, enet, help, hyper, icon, jove, kermit, mh, > mkmf, mmdf, news, notes, npl00, patch, path alias, rcs, rn, spms, > sumacc, sunrpc, tac, tools, umodem, and xns. Al Kossow scanned the Mt Xinu version including that page https://archive.org/details/bitsavers_mtXinuMTXI_25102566/page/n1/mode/2up https://bitsavers.org/pdf/mtXinu/MT_XINU_UCS_Apr_1986.pdf > > There are a preponderance of man pages as well as some focused > papers for B, SPMS, the Icon programming language, and MMDFII. I > didn't scribble down more notes before heading home because I figured > this was something I'd find on page one of an internet search, but > thus far have had no luck. If I don't turn up a digital copy soon > I might just have to make another trip up there soon with my flatbed > scanner. In any case I left a note too hoping whoever the curator > of that bookshelf is (it was in some club room) might have the scoop > on those binders, if they're left over from some long gone 4.3BSD > installation in the EE/CS department and if they might have some > siblings in other bookshelves nearby. > > - Matt G. From tuhs at tuhs.org Sun Jun 18 12:26:34 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Sun, 18 Jun 2023 02:26:34 +0000 Subject: [TUHS] 1981 Bell System Brochure and Price List Message-ID: Spent some time this afternoon scanning and included a couple of those 1981 marketing documents I relayed some prices from in a previous email: https://archive.org/details/bell-system-software-brochure-1981 https://archive.org/details/bell-system-commercial-software-fees-1981 The cover of the former came out a bit rough. The stock photo is very fine stippling and my scanner just couldn't cut it. 
In any case, the first is a series of one-page summaries of various
software offerings by the Bell System and the second is a price sheet
not unlike https://www.bell-labs.com/usr/dmr/www/licenses/pricelist84.pdf
but earlier. The larger brochure also includes software offerings for
Honeywell and IBM systems as well as some portable components.

- Matt G.

From f4grx at f4grx.net Mon Jun 19 19:52:40 2023
From: f4grx at f4grx.net (Sebastien F4GRX)
Date: Mon, 19 Jun 2023 11:52:40 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Hello,

my own compiler choked on if.b and mail.b because of this invalid
expression in nxtarg:

if(ap>ac) return(0*ap++);

I believe it should instead read:

if(ap>ac) return 0;

What do you think about this?

PS: The original code gives this AST, which shows how my compiler
interprets this source:

function name nxtarg
.  env: ref to function nxtarg
.  env: ref to function main
.  compound stmt
.  .  env: extern ap
.  .  env: extern ac
.  .  env: extern argv
.  .  env: ref to function nxtarg
.  .  env: ref to function main
.  .  if test
.  .  .  operation: GT
.  .  .  .  extern declaration ap
.  .  .  .  extern declaration ac
.  .  then stmt
.  .  .  return stmt
.  .  .  .  operation: MUL
.  .  .  .  .  value: 0
.  .  .  .  .  operation: POSTINC
.  .  .  .  .  .  extern declaration ap
.  .  return stmt
.  .  .  [LVALUE] operation: INDEX
.  .  .  .  extern declaration argv
.  .  .  .  operation: POSTINC
.  .  .  .  .  extern declaration ap

Sebastien

On 17/06/2023 at 10:19, Angelo Papenhoff wrote:
> Update: I'm now done with the first pass of this.
> I reversed all the programs and successfully ran them through my
> compiler (i haven't assembled or linked anything though).
> http://squoze.net/B/programs/
>
> To check for correctness, the files should of course be compiled,
> assembled and linked again.
Unfortunately my compiler currently > does not generate quite the same code as the original one. I will > have to work on this. > Most importantly & and | are only bitwise operators in the version > of B that compiled these programs, but some other differences (like > the fixup chain and the way strings are stored) exist too. > > It would be nice to have a fully working B system on v1/v2 UNIX again, > with everything built from source, we can even reconstruct different > versions of the runtime (and perhaps standard library). So far the > PDP-11 version of my B system has only run on v6 and 2.11BSD. > > best, > aap > > On 14/06/23, Angelo Papenhoff wrote: >> I will hopefully continue with this in the next time (if, goto, mail and >> glob are left). From f4grx at f4grx.net Mon Jun 19 20:18:45 2023 From: f4grx at f4grx.net (Sebastien F4GRX) Date: Mon, 19 Jun 2023 12:18:45 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: Hi, Sorry for my previous message, my analysis is wrong. The return value is still a typo, but this expression should not pose a problem, the AST shows that the postinc has priority, uses a legit lvalue, which result is multiplied by zero. This was probably not intended but syntactically correct. I have added better error identification and now I see that my problem is in exp(s) nextarg()[0] is refused. because nextarg() is not a lvalue, as expected by the indexing operation. I need to update my code generator so that any expression left of an index is accepted (and used as an address). Sebastien Le 19/06/2023 à 11:52, Sebastien F4GRX a écrit : > Hello, > > > my own compiler choked on if.b and mail.b because of this invalid > expression in nxtarg: > > if(ap>ac) return(0*ap++); > > > I believe it should instead read: > > if(ap>ac) return 0; > > > What do you think about this? 
> > > PS: The original code gives this AST which shows how my compiler > interprets this source : > > function name nxtarg > .  env: ref to function nxtarg > .  env: ref to function main > .  compound stmt > .  .  env: extern ap > .  .  env: extern ac > .  .  env: extern argv > .  .  env: ref to function nxtarg > .  .  env: ref to function main > .  .  if test > .  .  .  operation: GT > .  .  .  .  extern declaration ap > .  .  .  .  extern declaration ac > .  .  then stmt > .  .  .  return stmt > .  .  .  .  operation: MUL > .  .  .  .  .  value: 0 > .  .  .  .  .  operation: POSTINC > .  .  .  .  .  .  extern declaration ap > .  .  return stmt > .  .  .  [LVALUE] operation: INDEX > .  .  .  .  extern declaration argv > .  .  .  .  operation: POSTINC > .  .  .  .  .  extern declaration ap > > Sebastien > > > Le 17/06/2023 à 10:19, Angelo Papenhoff a écrit : >> Update: I'm now done with the first pass of this. >> I reversed all the programs and successfully ran them through my >> compiler (i haven't assembled or linked anything though). >> http://squoze.net/B/programs/ >> >> To check for correctness, the files should of course be compiled, >> assembled and linked again. Unfortunately my compiler currently >> does not generate quite the same code as the original one. I will >> have to work on this. >> Most importantly & and | are only bitwise operators in the version >> of B that compiled these programs, but some other differences (like >> the fixup chain and the way strings are stored) exist too. >> >> It would be nice to have a fully working B system on v1/v2 UNIX again, >> with everything built from source, we can even reconstruct different >> versions of the runtime (and perhaps standard library). So far the >> PDP-11 version of my B system has only run on v6 and 2.11BSD. >> >> best, >> aap >> >> On 14/06/23, Angelo Papenhoff wrote: >>> I will hopefully continue with this in the next time (if, goto, mail >>> and >>> glob are left). 
From lars at nocrew.org Mon Jun 19 20:48:27 2023
From: lars at nocrew.org (Lars Brinkhoff)
Date: Mon, 19 Jun 2023 10:48:27 +0000
Subject: [TUHS] Software written in B
In-Reply-To: (Sebastien F4GRX's message of "Mon, 19 Jun 2023 12:18:45 +0200")
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID: <7wcz1r20ec.fsf@junk.nocrew.org>

Sebastien F4GRX wrote:
>> my own compiler choked on if.b and mail.b because of this invalid
>> expression in nxtarg:
>>
>> if(ap>ac) return(0*ap++);
>
> The return value is still a typo, but this expression should not pose
> a problem, the AST shows that the postinc has priority, uses a legit
> lvalue, which result is multiplied by zero.
>
> This was probably not intended but syntactically correct.

I could well believe it's not a typo, but a "clever" way to write

if(ap>ac) {
    ap++;
    return 0;
}

From g.branden.robinson at gmail.com Mon Jun 19 20:55:49 2023
From: g.branden.robinson at gmail.com (G. Branden Robinson)
Date: Mon, 19 Jun 2023 05:55:49 -0500
Subject: [TUHS] Software written in B
In-Reply-To: <7wcz1r20ec.fsf@junk.nocrew.org>
References: <202306080331.3583Vrw7057546@ultimate.com> <7wcz1r20ec.fsf@junk.nocrew.org>
Message-ID: <20230619105549.nd55fq4uayyxbwnf@illithid>

At 2023-06-19T10:48:27+0000, Lars Brinkhoff wrote:
> Sebastien F4GRX wrote:
> >> if(ap>ac) return(0*ap++);
> >
> > This was probably not intended but syntactically correct.
>
> I could well believe it's not a typo, but a "clever" way to write
>
> if(ap>ac) {
>     ap++;
>     return 0;
> }

/* You were not expected to understand that. */

Regards,
Branden
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL:

From f4grx at f4grx.net Mon Jun 19 21:07:58 2023
From: f4grx at f4grx.net (Sebastien F4GRX)
Date: Mon, 19 Jun 2023 13:07:58 +0200
Subject: [TUHS] Software written in B
In-Reply-To: <20230619105549.nd55fq4uayyxbwnf@illithid>
References: <202306080331.3583Vrw7057546@ultimate.com> <7wcz1r20ec.fsf@junk.nocrew.org> <20230619105549.nd55fq4uayyxbwnf@illithid>
Message-ID: <3e603b07-1707-1782-3ffd-1aa4aa888d56@f4grx.net>

Yeah, I see that's exactly right :-)

I lack a lot of the context implied by this source code. Compactness
was quite important.

This is very interesting, and very useful to test my compiler, as I had
guessed!

Sebastien

On 19/06/2023 at 12:55, G. Branden Robinson wrote:
> At 2023-06-19T10:48:27+0000, Lars Brinkhoff wrote:
>> Sebastien F4GRX wrote:
>>>> if(ap>ac) return(0*ap++);
>>> This was probably not intended but syntactically correct.
>> I could well believe it's not a typo, but a "clever" way to write
>>
>> if(ap>ac) {
>>     ap++;
>>     return 0;
>> }
> /* You were not expected to understand that. */
>
> Regards,
> Branden

From tuhs at tuhs.org Tue Jun 20 04:44:19 2023
From: tuhs at tuhs.org (segaloco via TUHS)
Date: Mon, 19 Jun 2023 18:44:19 +0000
Subject: [TUHS] Software written in B
In-Reply-To: <3e603b07-1707-1782-3ffd-1aa4aa888d56@f4grx.net>
References: <7wcz1r20ec.fsf@junk.nocrew.org> <20230619105549.nd55fq4uayyxbwnf@illithid> <3e603b07-1707-1782-3ffd-1aa4aa888d56@f4grx.net>
Message-ID:

> This is very interesting, and very useful to test my compiler, as I have
> guessed !
>
> Sebastien

That's wonderful news. I'm glad I was able to point you in the right
direction (and that I didn't have to reverse any B code with my
luddite-level understanding of it...)

Team work makes the dream work!

- Matt G.
From aap at papnet.eu Fri Jun 23 20:59:02 2023 From: aap at papnet.eu (Angelo Papenhoff) Date: Fri, 23 Jun 2023 12:59:02 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: Another update: I have worked quite a bit on my compiler the last couple of days and have managed to make it produce the same code as those in the binaries (compared the intermediate code by eye). There were a few mistakes, the files have been updated. I removed some features that I figured were not in ken's original compiler and most importantly changed it to generate RPN code directly instead of parsing an expression into a tree and generating code from that, which ken confirmed was how it worked. I'm still not entirely happy with the result (the build() function seems a bit kludgy), but at least it seems to produce something accurate. Unfortunately I have no idea what the intermediate would have looked like, so the output of this is purely my fantasy. I will also have to adjust my take on ba now due to the changes. For anyone interested, this is my WIP version of bc.b: https://gist.github.com/aap/6df9b4c53c63592437d97dadab533649 aap On 17/06/23, Angelo Papenhoff wrote: > Update: I'm now done with the first pass of this. > I reversed all the programs and successfully ran them through my > compiler (i haven't assembled or linked anything though). > http://squoze.net/B/programs/ > > To check for correctness, the files should of course be compiled, > assembled and linked again. Unfortunately my compiler currently > does not generate quite the same code as the original one. I will > have to work on this. > Most importantly & and | are only bitwise operators in the version > of B that compiled these programs, but some other differences (like > the fixup chain and the way strings are stored) exist too. 
> > It would be nice to have a fully working B system on v1/v2 UNIX again, > with everything built from source, we can even reconstruct different > versions of the runtime (and perhaps standard library). So far the > PDP-11 version of my B system has only run on v6 and 2.11BSD. > > best, > aap > > On 14/06/23, Angelo Papenhoff wrote: > > I will hopefully continue with this in the next time (if, goto, mail and > > glob are left). From f4grx at f4grx.net Fri Jun 23 23:32:53 2023 From: f4grx at f4grx.net (Sebastien F4GRX) Date: Fri, 23 Jun 2023 15:32:53 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: Hi, Wow, this is amazing. For my information, where are the initialization values for the ctab[128] vector coming from? Do you still have the version that generates a tree before codegen? I know that generating code while parsing is possible for stack based machines, this is also how Crenshaw describes it in its document (but for 68k). I decided not to do it for my own compiler, so that I can write multiple code generators. Sebastien Le 23/06/2023 à 12:59, Angelo Papenhoff a écrit : > Another update: I have worked quite a bit on my compiler > the last couple of days and have managed to make it produce the same > code as those in the binaries (compared the intermediate code by eye). > There were a few mistakes, the files have been updated. > > I removed some features that I figured were not in ken's original > compiler and most importantly changed it to generate RPN code directly > instead of parsing an expression into a tree and generating code from > that, which ken confirmed was how it worked. > I'm still not entirely happy with the result (the build() function seems > a bit kludgy), but at least it seems to produce something accurate. 
> Unfortunately I have no idea what the intermediate would have looked > like, so the output of this is purely my fantasy. > I will also have to adjust my take on ba now due to the changes. > > For anyone interested, this is my WIP version of bc.b: > https://gist.github.com/aap/6df9b4c53c63592437d97dadab533649 > > aap > > On 17/06/23, Angelo Papenhoff wrote: >> Update: I'm now done with the first pass of this. >> I reversed all the programs and successfully ran them through my >> compiler (i haven't assembled or linked anything though). >> http://squoze.net/B/programs/ >> >> To check for correctness, the files should of course be compiled, >> assembled and linked again. Unfortunately my compiler currently >> does not generate quite the same code as the original one. I will >> have to work on this. >> Most importantly & and | are only bitwise operators in the version >> of B that compiled these programs, but some other differences (like >> the fixup chain and the way strings are stored) exist too. >> >> It would be nice to have a fully working B system on v1/v2 UNIX again, >> with everything built from source, we can even reconstruct different >> versions of the runtime (and perhaps standard library). So far the >> PDP-11 version of my B system has only run on v6 and 2.11BSD. >> >> best, >> aap >> >> On 14/06/23, Angelo Papenhoff wrote: >>> I will hopefully continue with this in the next time (if, goto, mail and >>> glob are left). From aap at papnet.eu Sat Jun 24 00:01:21 2023 From: aap at papnet.eu (Angelo Papenhoff) Date: Fri, 23 Jun 2023 16:01:21 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: On 23/06/23, Sebastien F4GRX wrote: > For my information, where are the initialization values for the > ctab[128] vector coming from? Probably some mix between last1120c and what made sense to me. 
Actually found a bug in my compiler related to this, will have to check
my earlier (tree) compiler for this too.

> Do you still have the version that generates a tree before codegen?

https://github.com/aap/b/blob/master/bc.b

> I know that generating code while parsing is possible for stack based
> machines, this is also how Crenshaw describes it in its document (but
> for 68k).

What I found somewhat difficult, and why I opted for the tree approach,
was mainly handling lvalues (but also conditional operators get a bit
easier). If you have 'x = y', the compiler is supposed to generate

va; 1   / lval of x
a; 2    / rval of y
b1      / =

but when you pop the = it's already too late to change the 'a' to a
'va'. So my solution was to remember one operator of output so I can
combine it with the next one.
I'm still not entirely happy with it (the build() function especially),
but it doesn't seem entirely wrong and does generate matching B code.

Still, if people have suggestions how to do this better, I'd love to get
some feedback on this.

aap

From f4grx at f4grx.net Sat Jun 24 00:10:11 2023
From: f4grx at f4grx.net (Sebastien F4GRX)
Date: Fri, 23 Jun 2023 16:10:11 +0200
Subject: [TUHS] Software written in B
In-Reply-To:
References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com>
Message-ID:

Hi again,

OK I found your compiler project on github: https://github.com/aap/b

I see you have introduced changes like this one:

-       auto ab[500], ava[150];
-       auto dirf, string, av, name[5], s;
+       auto ab 1000, ava 150;
+       auto dirf, string, av, name 5, s;

So this means that at some point the auto declaration was changed?

Oh, yes it was. Here is the original syntax:
https://www.bell-labs.com/usr/dmr/www/kbman.html

And this one describes the use of quotes for auto arrays, but that's for
the H6070! https://www.bell-labs.com/usr/dmr/www/bref.pdf

Also in bref, auto a[4] reserves 5 words, but in ken's version auto a 4
reserves 4 words.
I think I will make a command line option to support both language specs :) And fix my code, because I had found this example https://github.com/Spydr06/BCause/blob/main/examples/fibonacci.b and was convinced that constants in auto were initial values, but I now understand that this is incorrect. Initialized auto variables would have been an avoidable syntactic sugar on these machines! Thanks, Sebastien Le 23/06/2023 à 12:59, Angelo Papenhoff a écrit : > Another update: I have worked quite a bit on my compiler > the last couple of days and have managed to make it produce the same > code as those in the binaries (compared the intermediate code by eye). > There were a few mistakes, the files have been updated. > > I removed some features that I figured were not in ken's original > compiler and most importantly changed it to generate RPN code directly > instead of parsing an expression into a tree and generating code from > that, which ken confirmed was how it worked. > I'm still not entirely happy with the result (the build() function seems > a bit kludgy), but at least it seems to produce something accurate. > Unfortunately I have no idea what the intermediate would have looked > like, so the output of this is purely my fantasy. > I will also have to adjust my take on ba now due to the changes. > > For anyone interested, this is my WIP version of bc.b: > https://gist.github.com/aap/6df9b4c53c63592437d97dadab533649 > > aap > > On 17/06/23, Angelo Papenhoff wrote: >> Update: I'm now done with the first pass of this. >> I reversed all the programs and successfully ran them through my >> compiler (i haven't assembled or linked anything though). >> http://squoze.net/B/programs/ >> >> To check for correctness, the files should of course be compiled, >> assembled and linked again. Unfortunately my compiler currently >> does not generate quite the same code as the original one. I will >> have to work on this. 
>> Most importantly & and | are only bitwise operators in the version >> of B that compiled these programs, but some other differences (like >> the fixup chain and the way strings are stored) exist too. >> >> It would be nice to have a fully working B system on v1/v2 UNIX again, >> with everything built from source, we can even reconstruct different >> versions of the runtime (and perhaps standard library). So far the >> PDP-11 version of my B system has only run on v6 and 2.11BSD. >> >> best, >> aap >> >> On 14/06/23, Angelo Papenhoff wrote: >>> I will hopefully continue with this in the next time (if, goto, mail and >>> glob are left). From f4grx at f4grx.net Sat Jun 24 00:14:50 2023 From: f4grx at f4grx.net (Sebastien F4GRX) Date: Fri, 23 Jun 2023 16:14:50 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: <8590534a-e3ef-9532-41b0-11d38e32f177@f4grx.net> Hello, Yes, I found your code. Le 23/06/2023 à 16:01, Angelo Papenhoff a écrit : > Probably some mix between last1120c and what made sense to me. > Actually found a bug in my compiler related to this, will have to check > my earlier (tree) compiler for this too. It looks like these definitions are in https://github.com/aap/b/blob/master/b.h Do you pre-process your B source files? Sebastien From aap at papnet.eu Sat Jun 24 00:39:02 2023 From: aap at papnet.eu (Angelo Papenhoff) Date: Fri, 23 Jun 2023 16:39:02 +0200 Subject: [TUHS] Software written in B In-Reply-To: <8590534a-e3ef-9532-41b0-11d38e32f177@f4grx.net> References: <202306080331.3583Vrw7057546@ultimate.com> <8590534a-e3ef-9532-41b0-11d38e32f177@f4grx.net> Message-ID: This is for the compiler i wrote in C, for bootstrapping. I do use symbolic constants in the B code too, but there i'm just using sed as a preprocessor. 
On 23/06/23, Sebastien F4GRX wrote: > It looks like these definitions are in > https://github.com/aap/b/blob/master/b.h > > Do you pre-process your B source files? > > Sebastien > From aap at papnet.eu Sat Jun 24 00:49:29 2023 From: aap at papnet.eu (Angelo Papenhoff) Date: Fri, 23 Jun 2023 16:49:29 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: I also stumbled over these language differences. In my earlier compiler I tried to support all syntax that looked reasonable, but now I tried to make the code more accurate to the actual thing. I don't know much about the 6070 version of B and scj didn't remember many technical details when I asked him about it a few years ago. For auto vectors you can find an actual example here: https://github.com/DoctorWkt/pdp7-unix/blob/master/src/cmd/ind.b#L3 Ken said that the language was the same on the pdp-7 and pdp-11. The runtime we have for both are different but this must simply reflect different evolutionary stages rather than platform differences. I haven't gotten around to building my new compiler yet, but it would be interesting to see if it's small enough to fit into the 4kw of pdp-7 userspace. The runtime would have to be updated and expanded of course. In any case it should be fun to run it on v1/v2. Have to try that on my 11/05 :) It sure is interesting how many compilers have popped up for B over the years. I also have a strangely strong love for this little language. aap On 23/06/23, Sebastien F4GRX wrote: > Hi again, > > OK I found your compiler project on github: https://github.com/aap/b > > I see you have introduced changes like this one: > > -       auto ab[500], ava[150]; > -       auto dirf, string, av, name[5], s; > +       auto ab 1000, ava 150; > +       auto dirf, string, av, name 5, s; > > So it mean that at some point the auto declaration was changed? > > Oh, yes it was. 
Here is the original syntax : > https://www.bell-labs.com/usr/dmr/www/kbman.html > > And this one describes the use of quotes for auto arrays, but thats for > the H6070! https://www.bell-labs.com/usr/dmr/www/bref.pdf > > Also in bref, auto a[4] reserves 5 words, but in ken's version auto a 4 > reserves 4 words. > > I think I will make a command line option to support both language specs :) > > And fix my code, because I had found this example > https://github.com/Spydr06/BCause/blob/main/examples/fibonacci.b > > and was convinced that constants in auto were initial values, but I now > understand that this is incorrect. > > Initialized auto variables would have been an avoidable syntactic sugar > on these machines! From f4grx at f4grx.net Sat Jun 24 01:31:52 2023 From: f4grx at f4grx.net (Sebastien F4GRX) Date: Fri, 23 Jun 2023 17:31:52 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <1e651370-3ada-e211-c277-409d6563500d@f4grx.net> <202306080331.3583Vrw7057546@ultimate.com> Message-ID: <6273eb21-e529-4227-0e59-d9f496871fd1@f4grx.net> Hi, I just modified my compiler to support both syntaxes, that was easy. I am not trying to make my compiler as small as possible. The 6070 version also has "default:" in the switch statement and "break" to exit compound statements, of which I have just implemented the second for now. Thats completely backwards compatible with the early version. These are just fancy goto and labels. The ind.b program you shared must have been even earlier, using $( $) instead of braces and no parentheses on function names. With the price of storage these guys were really minimalists! I can imagine Ken not being happy about empty parentheses for functions without parameters :-) Would you mind to share your sed script to preprocess your B B compiler? Thats the only thing missing for me to try compiling it. Sebastien Le 23/06/2023 à 16:49, Angelo Papenhoff a écrit : > I also stumbled over these language differences. 
> In my earlier compiler I tried to support all syntax that looked > reasonable, but now I tried to make the code more accurate to the actual > thing. I don't know much about the 6070 version of B and scj didn't > remember many technical details when I asked him about it a few years ago. > > For auto vectors you can find an actual example here: > https://github.com/DoctorWkt/pdp7-unix/blob/master/src/cmd/ind.b#L3 > > Ken said that the language was the same on the pdp-7 and pdp-11. The > runtime we have for both are different but this must simply reflect > different evolutionary stages rather than platform differences. > I haven't gotten around to building my new compiler yet, but it would be > interesting to see if it's small enough to fit into the 4kw of pdp-7 > userspace. The runtime would have to be updated and expanded of course. > > In any case it should be fun to run it on v1/v2. Have to try that > on my 11/05 :) > > It sure is interesting how many compilers have popped up for B over the > years. I also have a strangely strong love for this little language. > > aap > > > On 23/06/23, Sebastien F4GRX wrote: >> Hi again, >> >> OK I found your compiler project on github: https://github.com/aap/b >> >> I see you have introduced changes like this one: >> >> -       auto ab[500], ava[150]; >> -       auto dirf, string, av, name[5], s; >> +       auto ab 1000, ava 150; >> +       auto dirf, string, av, name 5, s; >> >> So it mean that at some point the auto declaration was changed? >> >> Oh, yes it was. Here is the original syntax : >> https://www.bell-labs.com/usr/dmr/www/kbman.html >> >> And this one describes the use of quotes for auto arrays, but thats for >> the H6070! https://www.bell-labs.com/usr/dmr/www/bref.pdf >> >> Also in bref, auto a[4] reserves 5 words, but in ken's version auto a 4 >> reserves 4 words. 
>> >> I think I will make a command line option to support both language specs :) >> >> And fix my code, because I had found this example >> https://github.com/Spydr06/BCause/blob/main/examples/fibonacci.b >> >> and was convinced that constants in auto were initial values, but I now >> understand that this is incorrect. >> >> Initialized auto variables would have been an avoidable syntactic sugar >> on these machines! From aap at papnet.eu Sat Jun 24 01:36:08 2023 From: aap at papnet.eu (Angelo Papenhoff) Date: Fri, 23 Jun 2023 17:36:08 +0200 Subject: [TUHS] Software written in B In-Reply-To: <6273eb21-e529-4227-0e59-d9f496871fd1@f4grx.net> References: <202306080331.3583Vrw7057546@ultimate.com> <6273eb21-e529-4227-0e59-d9f496871fd1@f4grx.net> Message-ID: On 23/06/23, Sebastien F4GRX wrote: > Would you mind to share your sed script to preprocess your B B compiler? > Thats the only thing missing for me to try compiling it. Still same as my original compiler, so e.g. https://github.com/aap/b/blob/master/pdp11/preproc.sh Only the first few lines are platform dependent. aap From f4grx at f4grx.net Sat Jun 24 01:53:35 2023 From: f4grx at f4grx.net (Sebastien F4GRX) Date: Fri, 23 Jun 2023 17:53:35 +0200 Subject: [TUHS] Software written in B In-Reply-To: References: <202306080331.3583Vrw7057546@ultimate.com> <6273eb21-e529-4227-0e59-d9f496871fd1@f4grx.net> Message-ID: <3bc04212-9ee6-761c-df59-4197cd1a6088@f4grx.net> Thanks, I missed that one. My test driver program is now able to apply this preprocessing before running the compiler, but I forgot to implement parsing of global vector initializers, so i have to do that before :-) Sebastien Le 23/06/2023 à 17:36, Angelo Papenhoff a écrit : > On 23/06/23, Sebastien F4GRX wrote: >> Would you mind to share your sed script to preprocess your B B compiler? >> Thats the only thing missing for me to try compiling it. > Still same as my original compiler, so e.g. 
> https://github.com/aap/b/blob/master/pdp11/preproc.sh > Only the first few lines are platform dependent. > > > aap From noel.hunt at gmail.com Sat Jun 24 12:33:48 2023 From: noel.hunt at gmail.com (Noel Hunt) Date: Sat, 24 Jun 2023 12:33:48 +1000 Subject: [TUHS] C Btrees Message-ID: There is a little known suite of programs, written by Peter Weinberger, found as 'btree', or 'cbt', in the archives for Eighth and Tenth Edition. The code in the Eighth Edition archive seems to be the earliest, and has fewer utilities than available in the Tenth Edition code. A search through files shows that it was used by 'road', 'weather' and 'apnews'. There is an ms file, 'memo', describing the programs, amongst the code, but an appendix seems to be missing. If anyone knows about this or where it might be I'd like to get my hands on it. 'Memo' itself is interesting because it's the only troff document I've seen amongst the reseach papers (excluding Christopher Van Wyk's own paper of course) that uses 'ideal', in this case, for drawing a picture depicting B-tree structure. From douglas.mcilroy at dartmouth.edu Sat Jun 24 22:49:26 2023 From: douglas.mcilroy at dartmouth.edu (Douglas McIlroy) Date: Sat, 24 Jun 2023 08:49:26 -0400 Subject: [TUHS] C Btrees In-Reply-To: References: Message-ID: I used Ideal to make most of the figures in "Getting raster ellipses right", the first paper in CSTR #155. That paper grew out of a simple request from Rob Pike for an ellipse-drawing primitive for the Blit. Doug On Fri, Jun 23, 2023 at 10:34 PM Noel Hunt wrote: > > There is a little known suite of programs, written by Peter Weinberger, > found as 'btree', or 'cbt', in the archives for Eighth and Tenth > Edition. > > The code in the Eighth Edition archive seems to be the earliest, and > has fewer utilities than available in the Tenth Edition code. A search > through files shows that it was used by 'road', 'weather' and > 'apnews'. 
> > There is an ms file, 'memo', describing the programs, amongst the code, > but an appendix seems to be missing. If anyone knows about this or > where it might be I'd like to get my hands on it. > > 'Memo' itself is interesting because it's the only troff document I've > seen amongst the reseach papers (excluding Christopher Van Wyk's own > paper of course) that uses 'ideal', in this case, for drawing a > picture depicting B-tree structure. From noel.hunt at gmail.com Mon Jun 26 07:11:08 2023 From: noel.hunt at gmail.com (Noel Hunt) Date: Mon, 26 Jun 2023 07:11:08 +1000 Subject: [TUHS] C Btrees In-Reply-To: References: Message-ID: I see, thanks for that information. I was aware of that paper's existence but I have never read it. I think it was on the old Bell Labs website there was a collection of those CSTRs but I can't seem to find them now. On Sat, 24 Jun 2023 at 22:49, Douglas McIlroy wrote: > > I used Ideal to make most of the figures in "Getting raster ellipses > right", the first paper in CSTR #155. That paper grew out of a simple > request from Rob Pike for an ellipse-drawing primitive for the Blit. > > Doug > > On Fri, Jun 23, 2023 at 10:34 PM Noel Hunt wrote: > > > > There is a little known suite of programs, written by Peter Weinberger, > > found as 'btree', or 'cbt', in the archives for Eighth and Tenth > > Edition. > > > > The code in the Eighth Edition archive seems to be the earliest, and > > has fewer utilities than available in the Tenth Edition code. A search > > through files shows that it was used by 'road', 'weather' and > > 'apnews'. > > > > There is an ms file, 'memo', describing the programs, amongst the code, > > but an appendix seems to be missing. If anyone knows about this or > > where it might be I'd like to get my hands on it. 
> > > > 'Memo' itself is interesting because it's the only troff document I've > > seen amongst the reseach papers (excluding Christopher Van Wyk's own > > paper of course) that uses 'ideal', in this case, for drawing a > > picture depicting B-tree structure. From tuhs at tuhs.org Mon Jun 26 11:08:33 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Mon, 26 Jun 2023 01:08:33 +0000 Subject: [TUHS] Late 70's UNIX Stream Differences from V6 Message-ID: <0J591JqnItu6DwugGGVq7vd-7vVdrZ4qlsYhQ6Ezqnav8w_U2Zc1c7V5lNg3m-PZKXB7Q8LMYsdT-FLTTJkKSiONTIdTpCbwf9V_wQ3-Jm4=@protonmail.com> Hello, I'd like to share some analysis from my recent Sixth Edition pass of my mandiff repository. For the V5-V6 diff, I opted for a branching approach, starting with a last universal common ancestor (which isn't quite right [1]). I compared each set of changes with the MERT0, PWB 1.0, CB-UNIX 2.3, 32V, and to a lesser extent V7 and System III manuals attempting to suss out the spiderweb of changes between them all. I created a series of branches representing last common changes between groups of branches as well. This has resulted in a littering of merge commits in the repository, but a banana's gonna have a peel. A few important points about document genealogy here: - The MERT0 manual, in the introduction, denotes descent from the USG Program Generic 3 manual. Furthermore, there is a listing of which pages would be replaced, which also serves as a key to which pages should be PG3 original text. However, the hs(IV) and ht(IV) pages make reference to specific MERT pages, so I question the veracity of this list. In any case, for the purposes of this analysis, much may extrapolate to USG PG3 as well. More study is needed. - The CB-UNIX manual currently available is Edition 2.3. In studying the numbering system for CB, I've found that this represents Release 2, Issue 3, as in the kernel there are references to releases, not editions. 
The clue is in one of the manpages somewhere, I don't recollect as of this typing where, but that'll come back around soon enough. The manual itself appears to be from a binder that was once a CB-UNIX 2.1 binder and had select pages replaced. There are some bits and pieces of 2.1 pages that were otherwise slated to be replaced, alluding to things like the /etc/lines file in common with PG3. In any case I've prioritized 2.1 changes over 2.3 changes where they can be determined, but like PG3, no complete picture can be determined of 2.1 from available documentation. For each of the branches, the following number of files in total reflect V5-V6 changes which aren't incorporated: - 32V: 7 - PWB: 15 - MERT0: 46 Of these MERT0 has the greatest number of items lacking research's upstream changes from late '74-early '75. Among them: - Has a V5-ish bas(I), no rc(I) (ratfor) at all - The group system is not present, newgrp(I), group(V), chgrp(VIII), etc. are nowhere to be found - nice(I) has no priority argument, simply sets a "low priority" - TTYs are still referred to as "teletypes" instead of "typewriters" in many places - there are 10 TTYs max so many commands don't reflect adjustments for two-digit IDs (ps(I) in particular is quite different, very V5) - retains the lpr print command (which shows up again in 32V and System III) - additionally, according to the replacement page list, PG3 retained the fed and form editing programs - Program Generic may not have had a man(I) page, as the one here is a MERT0 addition, hard to say CB tragically needs to be remerged, found as I was typing this up the system call section got an errant merge with V6 changes that shouldn't be there. Needless to say there is much in section II of the CB manual that leans more V5-ish than V6-ish. PWB differs in minor ways. 
The differences can be found in this list https://gitlab.com/segaloco/mandiff/-/merge_requests?scope=all&state=closed Each of the obviously labeled, closed merges represents a snapshot diff of the particular branch in question. As stated, the CB branch currently is in dire straits, I'm going to work that up again sometime in the future, but I should be able to use this to produce diff-able reproductions of the MERT0 and CB-UNIX 2.3 manual sources for this repository, as well as any other materials that may pop up. - Matt G. [1] - This pass I did not take good notes on such matters, but there are a few pages I'll anecdotally say reflect contents predating V5 sprinkled amongst the various manuals. I will consult with previous diffs when questions arise on in-depth analysis of the non-research changes in the branches. In any case the historical record already confirms CB-UNIX at the very least branched off quite early. From arnold at skeeve.com Mon Jun 26 16:46:12 2023 From: arnold at skeeve.com (arnold at skeeve.com) Date: Mon, 26 Jun 2023 00:46:12 -0600 Subject: [TUHS] CSTRs [was Re: Re: C Btrees] In-Reply-To: References: Message-ID: <202306260646.35Q6kCqb018458@freefriends.org> I have a bunch of the CSTRs that I downloaded back when they were still there. Warren -- do you want them for the archive? Thanks, Arnold Noel Hunt wrote: > I see, thanks for that information. I was aware of that paper's > existence but I have never read it. I think it was on the old > Bell Labs website there was a collection of those CSTRs but > I can't seem to find them now. > > On Sat, 24 Jun 2023 at 22:49, Douglas McIlroy > wrote: > > > > I used Ideal to make most of the figures in "Getting raster ellipses > > right", the first paper in CSTR #155. That paper grew out of a simple > > request from Rob Pike for an ellipse-drawing primitive for the Blit. 
> > > > Doug > > > > On Fri, Jun 23, 2023 at 10:34 PM Noel Hunt wrote: > > > > > > There is a little known suite of programs, written by Peter Weinberger, > > > found as 'btree', or 'cbt', in the archives for Eighth and Tenth > > > Edition. > > > > > > The code in the Eighth Edition archive seems to be the earliest, and > > > has fewer utilities than available in the Tenth Edition code. A search > > > through files shows that it was used by 'road', 'weather' and > > > 'apnews'. > > > > > > There is an ms file, 'memo', describing the programs, amongst the code, > > > but an appendix seems to be missing. If anyone knows about this or > > > where it might be I'd like to get my hands on it. > > > > > > 'Memo' itself is interesting because it's the only troff document I've > > > seen amongst the reseach papers (excluding Christopher Van Wyk's own > > > paper of course) that uses 'ideal', in this case, for drawing a > > > picture depicting B-tree structure. From arnold at skeeve.com Wed Jun 28 16:26:02 2023 From: arnold at skeeve.com (Aharon Robbins) Date: Wed, 28 Jun 2023 09:26:02 +0300 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" Message-ID: Hi All. Attached is "A Supplemental Document For Awk". This circulated on USENET in the 80s. My copy is dated January 18, 1989, but I'm sure it's older than that. One clue is the reference to the 4.2 BSD manual, and 4.3 came out already in 1986 or so. Does anyone else have a copy of this with perhaps an older date? As far as I can tell from a short search, the author is no longer living. If someone knows better and can provide contact info for him, that'd be great. In the meantime, Warren, do you want to add it to the archives? Thanks! Arnold -------------- next part -------------- .RP .TL .B A Supplemental Document For AWK .sp .R - or - .sp .I Things Al, Pete, And Brian Didn't Mention Much .R .AU John W. 
Pierce .AI Department of Chemistry University of California, San Diego La Jolla, California 92093 jwp%chem at sdcsvax.ucsd.edu .AB As .B awk and its documentation are distributed with .I 4.2 BSD UNIX* .R there are a number of bugs, undocumented features, and features that are touched on so briefly in the documentation that the casual user may not realize their full significance. While this document applies primarily to the \fI4.2 BSD\fR version of \fIUNIX\fR, it is known that the \fI4.3 BSD\fR version does not have all of the bugs fixed, and that it does not have updated documentation. The situation with respect to the versions of \fBawk\fR distributed with other versions of \fIUNIX\fR and similar systems is unknown to the author. .FS *UNIX is a trademark of AT&T .FE .AE .LP In this document references to "the user manual" mean .I Awk - A Pattern Scanning and Processing Language (Second Edition) .R by Aho, Kernighan, and Weinberger. References to "awk(1)" mean the entry for .B awk in the .I UNIX Programmer's Manual, 4th Berkeley Distribution. .R References to "the documentation" mean both of those. .LP In most examples, the outermost set of braces ('{ }') has been omitted. They would, of course, be necessary in real scripts. .NH Known Bugs .LP There are three main bugs known to me. They involve: .IP Assignment to input fields. .IP Piping output to a program from within an \fBawk\fR script. .IP Using '*' in \fIprintf\fR field width and precision specifications. .NH 2 Assignment to Input Fields .LP [This problem is partially fixed in \fI4.3BSD\fR; see the last paragraph of this section regarding the unfixed portion.] .LP The user manual states that input fields may be objects of assignment statements.
Given the input line .DS field_one field_two field_three .DE the script .DS $2 = "new_field_2" print $0 .DE should print .DS field_one new_field_2 field_three .DE .LP This does not work; it will print .DS field_one field_two field_three .DE That is, the script will behave as if the assignment to $2 had not been made. However, explicitly referencing an "assigned to" field .I does recognize that the assignment has been made. If the script .DS $2 = "new_field_2" print $1, $2, $3 .DE is given the same input it will [properly] print .DS field_one new_field_2 field_three .DE Therefore, you can get around this bug with, e.g., .DS $2 = "new_field_2" output = $1 # Concatenate output fields for(i = 2; i <= NF; ++i) # into a single output line output = output OFS $i # with OFS between fields print output .DE .LP In \fI4.3BSD\fR, this bug has been fixed to the extent that the failing example above works correctly. However, a script like .DS $2 = "new_field_2" var = $0 print var .DE still gives incorrect output. This problem can be bypassed by using .DS \fIvar\fR = sprintf("%s", $0) .DE instead of "\fIvar\fR = $0"; \fIvar\fR will have the correct value. .NH 2 Piping Output to a Program .LP [This problem appears to have been fixed in \fI4.3BSD\fR, but that has not been exhaustively tested.] .LP The user manual states that .I print and .I printf statements may write to a program using, e.g., .DS print | "\fIcommand\fR" .DE This would pipe the output into \fIcommand\fR, and it does work. However, you should be aware that this causes .B awk to spawn a child process (\fIcommand\fR), and that it .I does not .R wait for the child to exit before it exits itself. In the case of a "slow" command like .B sort, .B awk may exit before .I command has finished. .LP This can cause problems in, for example, a shell script that depends on everything done by .B awk being finished before the next shell command is executed. 
Consider the shell script .DS awk -f awk_script input_file mv sorted_output somewhere_else .DE and the .B awk script .DS print output_line | "sort -o sorted_output" .DE If .I input_file is large .B awk will exit long before .B sort is finished. That means that the .B mv command will be executed before .B sort is finished, and the result is unlikely to be what you wanted. Other than fixing the source, there is no way to avoid this problem except to handle such pipes outside of the awk script, e.g. .DS awk -f awk_file input_file | sort -o sorted_output mv sorted_output somewhere_else .DE which is not wholly satisfactory. .LP See .I Sketchily Documented Features .R below for other considerations in redirecting output from within an .B awk script. .NH 2 Printf Field Width and Precision Specification With '*' .LP The document says that the \fIprintf\fR function provided is identical to the \fIprintf\fR provided by the \fIC\fR language \fBstdio\fR package. This is not true for the case of using '*' to specify a field width or precision. The command .DS printf("%*.s", len, string) .DE will cause a core dump. Given \fBawk\fR's age, it is likely that its \fIprintf\fR was written well before the use of '*' for specifying field width and precision appeared in the \fBstdio\fR library's \fIprintf\fR. Another possibility is that it wasn't implemented because it isn't really needed to achieve the same effect. .LP To accomplish this effect, you can utilize the fact that \fBawk\fR concatenates variables before it does any other processing on them. For example, assume a script has two variables \fIwid\fR and \fIprec\fR which control the width and precision used for printing another variable \fIval\fR: .DS [code to set "wid", "prec", and "val"] printf("%" wid "."
prec "d\en", val) .DE If, for example, \fIwid\fR is 8 and \fIprec\fR is 3, then \fBawk\fR will concatenate everything to the left of the comma in the \fIprintf\fR statement, and the statement will really be .DS printf("%8.3d\en", val) .DE These could, of course, have been assigned to some variable \fIfmt\fR before being used: .DS fmt = "%" wid "." prec "d" printf(fmt "\en", val) .DE Note, however, that the newline ("\en") in the second form \fIcannot\fR be included in the assignment to \fIfmt\fR. .bp .NH Undocumented Features .LP There are several undocumented features: .IP Variable values may be established on the command line. .IP A .B getline function exists that reads the next input line and starts processing it immediately. .IP Regular expressions accept octal representations of characters. .IP A .B -d flag argument produces debugging output if .B awk was compiled with "DEBUG" defined. .IP Scripts may be "compiled" and run later (providing the installer did what is necessary to make this work). .NH 2 Defining Variables On The Command Line .LP To pass variable values into a script at run time, you may use .IP .I variable=value .LP (as many as you like) between any "\fB-f \fIscriptname\fR" or .I program and the names of any files to be processed. For example, .DS awk -f awkscript today=\e"`date`\e" infile .DE would establish for .I awkscript a variable named .B today that had as its value the output of the .B date command. .LP There are a number of caveats: .IP Such assignments may appear only between .B -f .I awkscript (or \fIprogram\fR or [see below] \fB-R\fIawk.out\fR) and the name of any input file (or '-'). .IP Each .I variable=value combination must be a single argument (i.e. there must not be spaces around the '=' sign); .I value may be either a numeric value or a string. If it is a string, it must be enclosed in double quotes at the time \fBawk\fR reads the argument.
That means that the double quotes enclosing \fIvalue\fR on the command line must be protected from the shell as in the example above or it will remove them. .IP .I Variable is not available for use within the script until after the first record has been read and parsed, but it is available as soon as that has occurred so that it may be used before any other processing begins. It does not exist at the time the .B BEGIN block is executed, and if there was no input it will not exist in the .B END block (if any). .NH 2 Getline Function .LP .B Getline immediately reads the next input line (which is parsed into \fI$1\fR, \fI$2\fR, etc.) and starts processing it at the location of the call (as opposed to .B next which immediately reads the next input line but starts processing from the start of the script). .LP .B Getline facilitates performing some types of tasks such as processing files with multiline records and merging information from several files. To use the latter as an example, consider a case where two files, whose lines do not share a common format, must be processed together. Shell and \fBawk\fR scripts to do this might look something like .sp In the shell script .DS ( echo DATA1; cat datafile1; echo ENDdata1; \e echo DATA2; cat datafile2; echo ENDdata2 \e ) | \e awk -f awkscript - > awk_output_file .DE In the .B awk script .DS /^DATA1/ { # Next input line starts datafile1 while (getline && $1 !~ /^ENDdata1$/) { [processing for \fIdata1\fR lines] } } .sp 1 /^DATA2/ { # Next input line starts datafile2 while (getline && $1 !~ /^ENDdata2$/) { [processing for \fIdata2\fR lines] } } .DE There are, of course, other ways of accomplishing this particular task (primarily using \fBsed\fR to preprocess the information), but they are generally more difficult to write and more subject to logic errors. Many cases arising in practice are significantly more difficult, if not impossible, to handle without \fBgetline\fR.
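The getline technique described above can be tried as a minimal sketch, assuming a current POSIX awk (gawk, mawk, or similar) rather than the 4.2 BSD awk under discussion. The function name merge_sections and the sample data are made up for illustration. The loop tests "(getline) > 0" instead of a bare getline because POSIX awk returns -1 on a read error, which a bare truth test would treat as success.

```shell
#!/bin/sh
# Sketch only: assumes a POSIX awk, where plain getline returns 1 on success,
# 0 at end of input, and -1 on error. merge_sections is a hypothetical name.
merge_sections() {
    {
        # Two "files" with different line formats, each wrapped in markers.
        echo DATA1; printf 'a 1\nb 2\n'; echo ENDdata1
        echo DATA2; printf 'x:9\n';      echo ENDdata2
    } | awk '
    /^DATA1$/ {
        # Consume lines in place until the end marker for the first section.
        while ((getline) > 0 && $1 != "ENDdata1")
            print "data1:", $1, $2
    }
    /^DATA2$/ {
        # Same idea for the second section, which has a different format.
        while ((getline) > 0 && $1 != "ENDdata2")
            print "data2:", $0
    }'
}

merge_sections
```

With such an awk this should print "data1: a 1", "data1: b 2", and "data2: x:9", one per line: each getline replaces $0 where the call is made, so the marker lines never reach the per-section processing.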
.NH 2 Regular Expressions .LP The sequence "\fI\eddd\fR" (where 'd' is a digit) may be used to include explicit octal values in regular expressions. This is often useful if "nonprinting" characters have been used as "markers" in a file. It has not been tested for ASCII values outside the range 01 through 0127. .NH 2 Debugging output .LP [This is unlikely to be of interest to the casual user.] .sp If \fBawk\fR was compiled with "DEBUG" defined, then giving it a .B -d flag argument will cause it to produce debugging output when it is run. This is sometimes useful in finding obscure problems in scripts, though it is primarily intended for tracking down problems with \fBawk\fR itself. .NH 2 Script "Compilation" .LP [It is likely that this does not work at most sites. If it does not, the following will probably not be of interest to the casual user.] .sp The command .DS awk -S -f script.awk .DE produces a file named .B awk.out. This is a core image of .B awk after parsing the file .I script.awk. The command .DS awk -Rawk.out datafile .DE causes .B awk.out to be applied to \fIdatafile\fR (or the standard input if no input file is given). This avoids having to reparse large scripts each time they are used. Unfortunately, the way this is implemented requires some special action on the part of the person installing \fBawk\fR. .LP As \fBawk\fR is delivered with \fI4.2 BSD\fR (and \fI4.3 BSD\fR), .I awk.out is created by the \fBawk -S ...\fR process by calling .B sbrk() with '0', writing out the returned value, then writing out the core image from location 0 to the returned address. The \fBawk -R...\fR process reads the first word of .I awk.out to get the length of the image, calls .B brk() with that length, and then reads the image into itself starting at location 0. For this to work, \fBawk\fR must have been loaded with its text segment writeable. Unfortunately, the \fIBSD\fR default for \fBld\fR is to load with the text read-only and shareable. 
Thus, the installer must remember to take special action (e.g. "cc -N ..." [equivalently "ld -N ..."] for \fI4BSD\fR) if these flags are to work. .LP [Personally, I don't think it is a very good idea to give \fBawk\fR the opportunity to write on its text segment; I changed it so that only the data segment is overwritten.] .LP Also, due to what appears to be a lapse in logic, the first non-flag argument following \fB-R\fIawk.out\fR is discarded. [Disliking that behavior, I changed it so that the \fB-R\fR flag is treated like the \fB-f\fR flag: no flag arguments may follow it.] .bp .NH Sketchily Documented Features .LP .NH 2 Exit .LP The user manual says that using the .B exit function causes the script to behave as if end-of-input has been reached. Not mentioned explicitly is the fact that this will cause the .B END block to be executed if it exists. Also, two things are omitted: .IP \fBexit(\fIexpr\fB)\fR causes the script's exit status to be set to the value of \fIexpr\fR. .IP If .B exit is called within the .B END block, the script exits immediately. .NH 2 Mathematical Functions .LP The following builtin functions exist and are mentioned in .I awk(1) but not in the user manual. .IP \fBint(\fIx\fB)\fR 10 \fIx\fR truncated to an integer. .IP \fBsqrt(\fIx\fB)\fR 10 the square root of \fIx\fR for \fIx\fR >= 0, otherwise zero. .IP \fBexp(\fIx\fB)\fR 10 \fBe\fR-to-the-\fIx\fR for -88 <= \fIx\fR <= 88, zero for \fIx\fR < -88, and dumps core for \fIx\fR > 88. .IP \fBlog(\fIx\fB)\fR 10 the natural log of \fIx\fR. .NH 2 OFMT Variable .LP The variable .B OFMT may be set to, e.g. "%.2f", and purely numerical output will be bound by that restriction in .B print statements. The default value is "%.6g". Again, this is mentioned in .I awk(1) but not in the user manual. .NH 2 Array Elements .LP The user manual states that "Array elements ... spring into existence by being mentioned." This is literally true; .I any reference to an array element causes it to exist.
("I was thought about, therefore I am.") Take, for example, .DS if(array[$1] == "blah") { [process blah lines] } .DE If there is not an existing element of .B array whose subscript is the same as the contents of the current line's first field, .I one is created .R and its value (null, of course) is then compared with "blah". This can be a bit disconcerting, particularly when later processing is using .DS for (i in \fBarray\fR) { [do something with result of processing "blah" lines] } .DE to walk the array and expects all the elements to be non-null. Succinct practical examples are difficult to construct, but when this happens in a 500-line script it can be difficult to determine what has gone wrong. .NH 2 FS and Input Fields .LP By default any number of spaces or tabs can separate fields (i.e. there are no null input fields) and trailing spaces and tabs are ignored. However, if .B FS is explicitly set to any character other than a space (e.g., a tab: \fBFS = "\et"\fR), then a field is defined by each such character and trailing field separator characters are not ignored. For example, if '>' represents a tab then .DS one>>three>>five> .DE defines six fields, with fields two, four, and six being empty. .LP If .B FS is explicitly set to a space (\fBFS\fR = "\ "), then the default behavior obtains (this may be a bug); that is, both spaces and tabs are taken as field separators, there can be no null input fields, and trailing spaces and tabs are ignored. .NH 2 RS and Input Records .LP If .B RS is explicitly set to the null string (\fBRS\fR = ""), then the input record separator becomes a blank line, and the newline at the end of each input line is a field separator. This facilitates handling multiline records. .NH 2 "Fall Through" .LP This is mentioned in the user manual, but it is important enough that it is worth pointing out here, also.
.LP In the script .DS /\fIpattern_1\fR/ { [do something] } .sp /\fIpattern_2\fR/ { [do something] } .DE all input lines will be compared with both .I pattern_1 and .I pattern_2 unless the .B next function is used before the closing '}' in the .I pattern_1 portion. .NH 2 Output Redirection .LP Once a file (or pipe) is opened by .B awk it is not closed until .B awk exits. This can occasionally cause problems. For example, it means that a script that sorts its input lines into output files named by the contents of their first fields (similar to an example in the user manual) .DS { print $0 > $1 } .DE is going to fail if the number of different first fields exceeds about 10. This problem .I cannot be avoided by using something like .DS { command = "cat >> " $1 print $0 | command } .DE as the value of the variable .B command is different for each different value of .I $1 and is therefore treated as a different output "file". .LP [I have not been able to create a truly satisfactory fix for this that doesn't involve having \fBawk\fR treat output redirection to pipes differently from output to files; I would greatly appreciate hearing of one.] .NH 2 Field and Variable Types, Values, and Comparisons .LP The following is a synopsis of notes included with \fBawk\fR's source code. .NH 3 Types .LP Variables and fields can be strings or numbers or both. .NH 4 Variable Types .LP When a variable is set by the assignment .DS \fIvar\fR = \fIexpr\fR .DE its type is set to the type of .I expr (this includes +=, ++, etc.). An arithmetic expression is of type .I number, a concatenation is of type .I string, etc. If the assignment is a simple copy, e.g. .DS \fIvar1\fR = \fIvar2\fR .DE then the type of .I var1 becomes that of .I var2. .LP Type is determined by context; rarely, but always very inconveniently, this context-determined type is incorrect. As mentioned in .I awk(1) the type of an expression can be coerced to that desired. E.g.
.DS { \fIexpr1\fR + 0 .sp 1 \fIexpr2\fR "" # Concatenate with a null string } .DE coerces .I expr1 to numeric type and .I expr2 to string type. .NH 4 Field Types .LP As with variables, the type of a field is determined by context when possible, e.g. .RS .IP $1++ 8 clearly implies that \fI$1\fR is to be numeric, and .IP $1\ =\ $1\ ","\ $2 16 implies that $1 and $2 are both to be strings. .RE .LP Coercion is done as needed. In contexts where types cannot be reliably determined, e.g., .DS if($1 == $2) ... .DE the type of each field is determined on input by inspection. All fields are strings; in addition, each field that contains only a number is also considered numeric. Thus, the test .DS if($1 == $2) ... .DE will succeed on the inputs .DS 0 0.0 100 1e2 +100 100 1e-3 1e-3 .DE and fail on the inputs .DS (null) 0 (null) 0.0 2E-518 6E-427 .DE "only a number" in this case means matching the regular expression .DS ^[+-]?[0-9]*\e.?[0-9]+(e[+-]?[0-9]+)?$ .DE .NH 3 Values .LP Uninitialized variables have the numeric value 0 and the string value "". Therefore, if \fIx\fR is uninitialized, .DS if(x) ... if (x == "0") ... .DE are false, and .DS if(!x) ... if(x == 0) ... if(x == "") ... .DE are true. .LP Fields which are explicitly null have the string value "", and are not numeric. Non-existent fields (i.e., fields past \fBNF\fR) are also treated this way. .NH 3 Types of Comparisons .LP If both operands are numeric, the comparison is made numerically. Otherwise, operands are coerced to type string if necessary, and the comparison is made on strings. .NH 3 Array Elements .LP Array elements created by .B split are treated in the same way as fields. 
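The comparison rules above can be tried out with a small sketch like the following. It assumes a POSIX-style awk on the PATH, which may not behave exactly like the 4.2 BSD awk the document describes; the function name compare_fields and the use of ':' as the field separator (so that an explicitly null field can be supplied) are choices made only for this illustration.

```shell
#!/bin/sh
# Sketch only: assumes a POSIX-style awk; the 4.2 BSD awk may differ.
# compare_fields is a hypothetical helper name used for this example.
compare_fields() {
    # Fields are separated by ':' so that an explicitly null field can be given.
    # Prints "equal" when awk considers $1 == $2 true, otherwise "differ".
    awk -F: '{ if ($1 == $2) print "equal"; else print "differ" }'
}

echo '100:1e2'  | compare_fields   # both fields look like numbers: numeric comparison
echo '+100:100' | compare_fields   # a leading sign still counts as "only a number"
echo 'abc:abc'  | compare_fields   # neither field is numeric: string comparison
echo ':0'       | compare_fields   # null field is not numeric: "" is compared with "0" as strings
```

The first three inputs should be reported as equal and the last as different, matching the "succeed" and "fail" cases listed in the Field Types section.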
From arnold at skeeve.com Wed Jun 28 16:45:19 2023 From: arnold at skeeve.com (arnold at skeeve.com) Date: Wed, 28 Jun 2023 00:45:19 -0600 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: <202306280645.35S6jJKB008257@freefriends.org> Hmmm, skimming the file for the first time in a long time, I see that he references 4.3 BSD as well. Clearly, this document evolved over time. I would still be interested in earlier versions if anyone has. Thanks, Arnold Aharon Robbins wrote: > Hi All. > > Attached is "A Supplemental Document For Awk". This circulated on USENET > in the 80s. My copy is dated January 18, 1989, but I'm sure it's > older than that. One clue is the reference to the 4.2 BSD manual, > and 4.3 came out already in 1986 or so. > > Does anyone else have a copy of this with perhaps an older date? > > As far as I can tell from a short search, the author is no > longer living. If someone knows better and can provide contact > info for him, that'd be great. > > In the meantime, Warren, do you want to add it to the archives? > > Thanks! > > Arnold From ats at offog.org Thu Jun 29 03:48:47 2023 From: ats at offog.org (Adam Sampson) Date: Wed, 28 Jun 2023 18:48:47 +0100 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: On Wed, Jun 28, 2023 at 09:26:02AM +0300, Aharon Robbins wrote: > Attached is "A Supplemental Document For Awk". This circulated on > USENET in the 80s. My copy is dated January 18, 1989, but I'm sure > it's older than that. In the utzoo Usenet archive, there are two versions of this document and a few mentions of it... John Pierce posted to comp.unix.questions on 1989-04-02, saying he'd written it "four or five years ago". Stu Heiss, in comp.unix.questions on 1989-03-06, said it was "posted to net.sources 18 Jun 86 with message-id 238 at sdchema.sdchem.uucp". 
Unfortunately this isn't in the utzoo archive or the net.sources.mbox in archive.org's Usenet Historical Collection. A copy identical to yours was posted by Jim Harkins to comp.unix.questions on 1990-03-29. There's a later version, fixing a typo and some formatting and adding a mention of \f and \b in printf, which was posted by Brian Kantor to comp.doc on 1987-10-11 -- I've attached this. The same file (with two .bps commented out) was reposted in comp.unix.questions on 1989-11-16 by Francois-Michel Lang. Thanks, -- Adam Sampson -------------- next part -------------- Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!uunet!seismo!esosun!ucsdhub!sdcsvax!brian From: brian at sdcsvax.UCSD.EDU (Brian Kantor) Newsgroups: comp.doc Subject: AWK supplementary document - troff with 'ms' macros Message-ID: <4070 at sdcsvax.UCSD.EDU> Date: Sun, 11-Oct-87 02:40:02 EDT Article-I.D.: sdcsvax.4070 Posted: Sun Oct 11 02:40:02 1987 Date-Received: Mon, 12-Oct-87 21:20:14 EDT Sender: root at sdcsvax.UCSD.EDU Organization: UCSD wombat breeding society Lines: 745 Approved: brian at cyberpunk.ucsd.edu .RP .TL .B A Supplemental Document For AWK .sp .R - or - .sp .I Things Al, Pete, And Brian Didn't Mention Much .R .AU John W. Pierce .AI Department of Chemistry University of California, San Diego La Jolla, California 92093 jwp%chem at sdcsvax.ucsd.edu .AB As .B awk and its documentation are distributed with .I 4.2 BSD UNIX* .R there are a number of bugs, undocumented features, and features that are touched on so briefly in the documentation that the casual user may not realize their full significance. While this document applies primarily to the \fI4.2 BSD\fR version of \fIUNIX\fR, it is known that the \fI4.3 BSD\fR version does not have all of the bugs fixed, and that it does not have updated documentation. The situation with respect to the versions of \fBawk\fR distributed with other versions of \fIUNIX\fR and similar systems is unknown to the author.
.FS *UNIX is a trademark of AT&T .FE .AE .LP In this document references to "the user manual" mean .I Awk - A Pattern Scanning and Processing Language (Second Edition) .R by Aho, Kernighan, and Weinberger. References to "awk(1)" mean the entry for .B awk in the .I UNIX Programmer's Manual, 4th Berkeley Distribution. .R References to "the documentation" mean both of those. .LP In most examples, the outermost set of braces ('{ }') has been omitted. They would, of course, be necessary in real scripts. .NH Known Bugs .LP There are three main bugs known to me. They involve: .IP Assignment to input fields. .IP Piping output to a program from within an \fBawk\fR script. .IP Using '*' in \fIprintf\fR field width and precision specifications does not work, nor do '\\f' and '\\b' print formfeed and backspace respectively. .NH 2 Assignment to Input Fields .LP [This problem is partially fixed in \fI4.3BSD\fR; see the last paragraph of this section regarding the unfixed portion.] .LP The user manual states that input fields may be objects of assignment statements. Given the input line .DS field_one field_two field_three .DE the script .DS $2 = "new_field_2" print $0 .DE should print .DS field_one new_field_2 field_three .DE .LP This does not work; it will print .DS field_one field_two field_three .DE That is, the script will behave as if the assignment to $2 had not been made. However, explicitly referencing an "assigned to" field .I does recognize that the assignment has been made. If the script .DS $2 = "new_field_2" print $1, $2, $3 .DE is given the same input it will [properly] print .DS field_one new_field_2 field_three .DE Therefore, you can get around this bug with, e.g., .DS $2 = "new_field_2" output = $1 # Concatenate output fields for(i = 2; i <= NF; ++i) # into a single output line output = output OFS $i # with OFS between fields print output .DE .LP In \fI4.3BSD\fR, this bug has been fixed to the extent that the failing example above works correctly.
However, a script like .DS $2 = "new_field_2" var = $0 print var .DE still gives incorrect output. This problem can be bypassed by using .DS \fIvar\fR = sprintf("%s", $0) .DE instead of "\fIvar\fR = $0"; \fIvar\fR will have the correct value. .NH 2 Piping Output to a Program .LP [This problem appears to have been fixed in \fI4.3BSD\fR, but that has not been exhaustively tested.] .LP The user manual states that .I print and .I printf statements may write to a program using, e.g., .DS print | "\fIcommand\fR" .DE This would pipe the output into \fIcommand\fR, and it does work. However, you should be aware that this causes .B awk to spawn a child process (\fIcommand\fR), and that it .I does not .R wait for the child to exit before it exits itself. In the case of a "slow" command like .B sort, .B awk may exit before .I command has finished. .LP This can cause problems in, for example, a shell script that depends on everything done by .B awk being finished before the next shell command is executed. Consider the shell script .DS awk -f awk_script input_file mv sorted_output somewhere_else .DE and the .B awk script .DS print output_line | "sort -o sorted_output" .DE If .I input_file is large .B awk will exit long before .B sort is finished. That means that the .B mv command will be executed before .B sort is finished, and the result is unlikely to be what you wanted. Other than fixing the source, there is no way to avoid this problem except to handle such pipes outside of the awk script, e.g. .DS awk -f awk_file input_file | sort -o sorted_output mv sorted_output somewhere_else .DE which is not wholly satisfactory. .LP See .I Sketchily Documented Features .R below for other considerations in redirecting output from within an .B awk script. .NH 2 Printf and '*', '\\f', and '\\b' .LP The document says that the \fIprintf\fR function provided is identical to the \fIprintf\fR provided by the \fIC\fR language \fBstdio\fR package. 
This is incorrect: '*' cannot be used to specify a field width or precision, and '\\f' and '\\b' cannot be used to print formfeeds and backspaces. .LP The command .DS printf("%*.s", len, string) .DE will cause a core dump. Given \fBawk\fR's age, it is likely that its \fIprintf\fR was written well before the use of '*' for specifying field width and precision appeared in the \fBstdio\fR library's \fIprintf\fR. Another possibility is that it wasn't implemented because it isn't really needed to achieve the same effect. .LP To accomplish this effect, you can utilize the fact that \fBawk\fR concatenates variables before it does any other processing on them. For example, assume a script has two variables \fIwid\fR and \fIprec\fR which control the width and precision used for printing another variable \fIval\fR: .DS [code to set "wid", "prec", and "val"] printf("%" wid "." prec "d\en", val) .DE If, for example, \fIwid\fR is 8 and \fIprec\fR is 3, then \fBawk\fR will concatenate everything to the left of the comma in the \fIprintf\fR statement, and the statement will really be .DS printf("%8.3d\en", val) .DE These could, of course, have been assigned to some variable \fIfmt\fR before being used: .DS fmt = "%" wid "." prec "d" printf(fmt "\en", val) .DE Note, however, that the newline ("\en") in the second form \fIcannot\fR be included in the assignment to \fIfmt\fR. .LP To allow use of '\\f' and '\\b', \fBawk\fR's \fIlex\fR script must be changed. This is trivial to do (it is done at the point where '\\n' and '\\t' are processed), but requires having source code. [I have fixed this and have not seen any unwanted effects.] .bp .NH Undocumented Features .LP There are several undocumented features: .IP Variable values may be established on the command line. .IP A .B getline function exists that reads the next input line and starts processing it immediately. .IP Regular expressions accept octal representations of characters.
.IP A .B -d flag argument produces debugging output if .B awk was compiled with "DEBUG" defined. .IP Scripts may be "compiled" and run later (providing the installer did what is necessary to make this work). .NH 2 Defining Variables On The Command Line .LP To pass variable values into a script at run time, you may use .IP .I variable=value .LP (as many as you like) between any "\fB-f \fIscriptname\fR" or .I program and the names of any files to be processed. For example, .DS awk -f awkscript today=\e"`date`\e" infile .DE would establish for .I awkscript a variable named .B today that had as its value the output of the .B date command. .LP There are a number of caveats: .IP Such assignments may appear only between .B -f .I awkscript (or \fIprogram\fR or [see below] \fB-R\fIawk.out\fR) and the name of any input file (or '-'). .IP Each .I variable=value combination must be a single argument (i.e. there must not be spaces around the '=' sign); .I value may be either a numeric value or a string. If it is a string, it must be enclosed in double quotes at the time \fBawk\fR reads the argument. That means that the double quotes enclosing \fIvalue\fR on the command line must be protected from the shell as in the example above or it will remove them. .IP .I Variable is not available for use within the script until after the first record has been read and parsed, but it is available as soon as that has occurred so that it may be used before any other processing begins. It does not exist at the time the .B BEGIN block is executed, and if there was no input it will not exist in the .B END block (if any). .NH 2 Getline Function .LP .B Getline immediately reads the next input line (which is parsed into \fI$1\fR, \fI$2\fR, etc) and starts processing it at the location of the call (as opposed to .B next which immediately reads the next input line but starts processing from the start of the script). 
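The difference between getline and next described above can be seen in a minimal sketch such as the one below. It assumes a POSIX-style awk, in which a plain getline replaces the current record and lets the action carry on from the point of the call; the function name pair_lines and the "name"/"value" input layout are invented for this illustration and are not taken from the document.

```shell
#!/bin/sh
# Sketch only: assumes a POSIX-style awk on the PATH.
# pair_lines is a hypothetical helper name used for this example.
pair_lines() {
    # On a line starting with "name", remember its second field, then use
    # getline to pull in the following line. Processing continues right
    # here, inside the same action, rather than restarting at the top of
    # the script as it would after "next".
    awk '/^name/ { key = $2; getline; print key, $0 }'
}

printf 'name x\nvalue 1\nname y\nvalue 2\n' | pair_lines
```

Each "name" line is joined with the line that follows it, so the remembered key is printed together with the freshly read record.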
.LP .B Getline facilitates performing some types of tasks such as processing files with multiline records and merging information from several files. To use the latter as an example, consider a case where two files, whose lines do not share a common format, must be processed together. Shell and \fBawk\fR scripts to do this might look something like .sp In the shell script .DS ( echo DATA1; cat datafile1; echo ENDdata1 \e echo DATA2; cat datafile2; echo ENDdata2 \e ) | \e awk -f awkscript - > awk_output_file .DE In the .B awk script .DS /^DATA1/ { # Next input line starts datafile1 while (getline && $1 !~ /^ENDdata1$/) { [processing for \fIdata1\fR lines] } } .sp 1 /^DATA2/ { # Next input line starts datafile2 while (getline && $1 !~ /^ENDdata2$/) { [processing for \fIdata2\fR lines] } } .DE There are, of course, other ways of accomplishing this particular task (primarily using \fBsed\fR to preprocess the information), but they are generally more difficult to write and more subject to logic errors. Many cases arising in practice are significantly more difficult, if not impossible, to handle without \fBgetline\fR. .NH 2 Regular Expressions .LP The sequence "\fI\eddd\fR" (where 'd' is a digit) may be used to include explicit octal values in regular expressions. This is often useful if "nonprinting" characters have been used as "markers" in a file. It has not been tested for ASCII values outside the range 01 through 0127. .NH 2 Debugging output .LP [This is unlikely to be of interest to the casual user.] .sp If \fBawk\fR was compiled with "DEBUG" defined, then giving it a .B -d flag argument will cause it to produce debugging output when it is run. This is sometimes useful in finding obscure problems in scripts, though it is primarily intended for tracking down problems with \fBawk\fR itself. .NH 2 Script "Compilation" .LP [It is likely that this does not work at most sites. If it does not, the following will probably not be of interest to the casual user.] 
.sp The command .DS awk -S -f script.awk .DE produces a file named .B awk.out. This is a core image of .B awk after parsing the file .I script.awk. The command .DS awk -Rawk.out datafile .DE causes .B awk.out to be applied to \fIdatafile\fR (or the standard input if no input file is given). This avoids having to reparse large scripts each time they are used. Unfortunately, the way this is implemented requires some special action on the part of the person installing \fBawk\fR. .LP As \fBawk\fR is delivered with \fI4.2 BSD\fR (and \fI4.3 BSD\fR), .I awk.out is created by the \fBawk -S ...\fR process by calling .B sbrk() with '0', writing out the returned value, then writing out the core image from location 0 to the returned address. The \fBawk -R...\fR process reads the first word of .I awk.out to get the length of the image, calls .B brk() with that length, and then reads the image into itself starting at location 0. For this to work, \fBawk\fR must have been loaded with its text segment writeable. Unfortunately, the \fIBSD\fR default for \fBld\fR is to load with the text read-only and shareable. Thus, the installer must remember to take special action (e.g. "cc -N ..." [equivalently "ld -N ..."] for \fI4BSD\fR) if these flags are to work. .LP [Personally, I don't think it is a very good idea to give \fBawk\fR the opportunity to write on its text segment; I changed it so that only the data segment is overwritten.] .LP Also, due to what appears to be a lapse in logic, the first non-flag argument following \fB-R\fIawk.out\fR is discarded. [Disliking that behavior, I changed it so that the \fB-R\fR flag is treated like the \fB-f\fR flag: no flag arguments may follow it.] .bp .NH Sketchily Documented Features .LP .NH 2 Exit .LP The user manual says that using the .B exit function causes the script to behave as if end-of-input has been reached. Not mentioned explicitly is the fact that this will cause the .B END block to be executed if it exists.
Also, two things are omitted: .IP \fBexit(\fIexpr\fB)\fR causes the script's exit status to be set to the value of \fIexpr\fR. .IP If .B exit is called within the .B END block, the script exits immediately. .NH 2 Mathematical Functions .LP The following builtin functions exist and are mentioned in .I awk(1) but not in the user manual. .IP \fBint(\fIx\fB)\fR 10 \fIx\fR truncated to an integer. .IP \fBsqrt(\fIx\fB)\fR 10 the square root of \fIx\fR for \fIx\fR >= 0, otherwise zero. .IP \fBexp(\fIx\fB)\fR 10 \fBe\fR-to-the-\fIx\fR for -88 <= \fIx\fR <= 88, zero for \fIx\fR < -88, and dumps core for \fIx\fR > 88. .IP \fBlog(\fIx\fB)\fR 10 the natural log of \fIx\fR. .NH 2 OFMT Variable .LP The variable .B OFMT may be set to, e.g. "%.2f", and purely numerical output will be bound by that restriction in .B print statements. The default value is "%.6g". Again, this is mentioned in .I awk(1) but not in the user manual. .NH 2 Array Elements .LP The user manual states that "Array elements ... spring into existence by being mentioned." This is literally true; .I any reference to an array element causes it to exist. ("I was thought about, therefore I am.") Take, for example, .DS if(array[$1] == "blah") { [process blah lines] } .DE If there is not an existing element of .B array whose subscript is the same as the contents of the current line's first field, .I one is created .R and its value (null, of course) is then compared with "blah". This can be a bit disconcerting, particularly when later processing is using .DS for (i in \fBarray\fR) { [do something with result of processing "blah" lines] } .DE to walk the array and expects all the elements to be non-null. Succinct practical examples are difficult to construct, but when this happens in a 500 line script it can be difficult to determine what has gone wrong. .NH 2 FS and Input Fields .LP By default any number of spaces or tabs can separate fields (i.e.
there are no null input fields) and trailing spaces and tabs are ignored. However, if .B FS is explicitly set to any character other than a space (e.g., a tab: \fBFS = "\et"\fR), then a field is defined by each such character and trailing field separator characters are not ignored. For example, if '>' represents a tab then .DS one>>three>>five> .DE defines six fields, with fields two, four, and six being empty. .LP If .B FS is explicitly set to a space (\fBFS\fR = "\ "), then the default behavior obtains (this may be a bug); that is, both spaces and tabs are taken as field separators, there can be no null input fields, and trailing spaces and tabs are ignored. .NH 2 RS and Input Records .LP If .B RS is explicitly set to the null string (\fBRS\fR = ""), then the input record separator becomes a blank line, and the newline at the end of each input line is a field separator. This facilitates handling multiline records. .NH 2 "Fall Through" .LP This is mentioned in the user manual, but it is important enough that it is worth pointing out here, also. .LP In the script .DS /\fIpattern_1\fR/ { [do something] } .sp /\fIpattern_2\fR/ { [do something] } .DE all input lines will be compared with both .I pattern_1 and .I pattern_2 unless the .B next function is used before the closing '}' in the .I pattern_1 portion. .NH 2 Output Redirection .LP Once a file (or pipe) is opened by .B awk it is not closed until .B awk exits. This can occasionally cause problems. For example, it means that a script that sorts its input lines into output files named by the contents of their first fields (similar to an example in the user manual) .DS { print $0 > $1 } .DE is going to fail if the number of different first fields exceeds about 10. This problem .I cannot be avoided by using something like .DS { command = "cat >> " $1 print $0 | command } .DE as the value of the variable .B command is different for each different value of .I $1 and is therefore treated as a different output "file".
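For comparison, here is a sketch of how the same sorting-into-files job might be written for an awk that has a close() function. That function is part of POSIX awk but is not available in the 4.2 BSD awk discussed here, so this is an illustration under that assumption rather than a fix for the version at hand; the function name split_by_first_field is hypothetical.

```shell
#!/bin/sh
# Sketch only: relies on close(), which POSIX awk provides but the
# 4.2 BSD awk described in this document does not.
# split_by_first_field is a hypothetical helper name.
split_by_first_field() {
    # Append each input line to the file named by its first field, then
    # close that file at once so only one output file is open at a time.
    awk '{ print $0 >> $1; close($1) }'
}
```

Appending with '>>' matters in this form: reopening a closed file with '>' would truncate it on every write. Because the output file names come straight from the input, it is safest to try this in a scratch directory, e.g. by piping a few test lines into split_by_first_field there.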
.LP [I have not been able to create a truly satisfactory fix for this that doesn't involve having \fBawk\fR treat output redirection to pipes differently from output to files; I would greatly appreciate hearing of one.] .NH 2 Field and Variable Types, Values, and Comparisons .LP The following is a synopsis of notes included with \fBawk\fR's source code. .NH 3 Types .LP Variables and fields can be strings or numbers or both. .NH 4 Variable Types .LP When a variable is set by the assignment .DS \fIvar\fR = \fIexpr\fR .DE its type is set to the type of .I expr (this includes +=, ++, etc). An arithmetic expression is of type .I number, a concatenation is of type .I string, etc. If the assignment is a simple copy, e.g. .DS \fIvar1\fR = \fIvar2\fR .DE then the type of .I var1 becomes that of .I var2. .LP Type is determined by context; rarely, but always very inconveniently, this context-determined type is incorrect. As mentioned in .I awk(1) the type of an expression can be coerced to that desired. E.g. .DS { \fIexpr1\fR + 0 .sp 1 \fIexpr2\fR "" # Concatenate with a null string } .DE coerces .I expr1 to numeric type and .I expr2 to string type. .NH 4 Field Types .LP As with variables, the type of a field is determined by context when possible, e.g. .RS .IP $1++ 8 clearly implies that \fI$1\fR is to be numeric, and .IP $1\ =\ $1\ ","\ $2 16 implies that $1 and $2 are both to be strings. .RE .LP Coercion is done as needed. In contexts where types cannot be reliably determined, e.g., .DS if($1 == $2) ... .DE the type of each field is determined on input by inspection. All fields are strings; in addition, each field that contains only a number is also considered numeric. Thus, the test .DS if($1 == $2) ... 
.DE will succeed on the inputs .DS 0 0.0 100 1e2 +100 100 1e-3 1e-3 .DE and fail on the inputs .DS (null) 0 (null) 0.0 2E-518 6E-427 .DE "only a number" in this case means matching the regular expression .DS ^[+-]?[0-9]*\e.?[0-9]+(e[+-]?[0-9]+)?$ .DE .NH 3 Values .LP Uninitialized variables have the numeric value 0 and the string value "". Therefore, if \fIx\fR is uninitialized, .DS if(x) ... if (x == "0") ... .DE are false, and .DS if(!x) ... if(x == 0) ... if(x == "") ... .DE are true. .LP Fields which are explicitly null have the string value "", and are not numeric. Non-existent fields (i.e., fields past \fBNF\fR) are also treated this way. .NH 3 Types of Comparisons .LP If both operands are numeric, the comparison is made numerically. Otherwise, operands are coerced to type string if necessary, and the comparison is made on strings. .NH 3 Array Elements .LP Array elements created by .B split are treated in the same way as fields. From ken.unix.guy at gmail.com Thu Jun 29 04:03:53 2023 From: ken.unix.guy at gmail.com (KenUnix) Date: Wed, 28 Jun 2023 14:03:53 -0400 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: Guys, It's been too long. What would I use to compile this man page source? I do remember some option switches are required. Yes? Thanks On Wed, Jun 28, 2023 at 1:49 PM Adam Sampson wrote: > On Wed, Jun 28, 2023 at 09:26:02AM +0300, Aharon Robbins wrote: > > Attached is "A Supplemental Document For Awk". This circulated on > > USENET in the 80s. My copy is dated January 18, 1989, but I'm sure > > it's older than that. > > In the utzoo Usenet archive, there are two versions of this document and > a few mentions of it... > > John Pierce posted to comp.unix.questions on 1989-04-02, saying he'd > written it "four or five years ago". > > Stu Heiss, in comp.unix.questions on 1989-03-06, said it was "posted to > net.sources 18 Jun 86 with message-id 238 at sdchema.sdchem.uucp". 
> Unfortunately this isn't in the utzoo archive or the net.sources.mbox > in archive.org's Usenet Historical Collection. > > A copy identical to yours was posted by Jim Harkins to > comp.unix.questions on 1990-03-29. > > There's a later version, fixing a typo and some formatting and adding a > mention of \f and \b in printf, which was posted by Brian Kantor to > comp.doc on 1987-10-11 -- I've attached this. The same file (with two > .bps commented out) was reposted in comp.unix.questions on 1989-11-16 by > Francois-Michel Lang. > > Thanks, > > -- > Adam Sampson > -- End of line JOB TERMINATED -->> Okey Dokey, OK Boss From clemc at ccc.com Thu Jun 29 04:38:40 2023 From: clemc at ccc.com (Clem Cole) Date: Wed, 28 Jun 2023 14:38:40 -0400 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: Download the file and make sure you save it in "UNIX" format, not DOS ( *i.e.* newline delimited not the nasty cruft) -- (if you are not sure how to do that running the dos2unix(1) command will assure it's was not unix format when you are done). % file awkdoc awkdoc: troff or preprocessor input text, ASCII text % man groff We'll leave it to you to figure out which switches for troff/groff and macro package (hint: try the head(1) command to peak at the first few lines -- there are three likely choices, but it's pretty obvious since the same one as most V7 documents). FWIW: If you got a copy of Kernighan and Pike's - "The Unix Programming Environment" [ISBN 0-13-937699-2] which is available at most retailers. You can read Chapter 9 for this question. Although, given so many of the questions you seem to like to ask here, please consider doing all the exercises in the entire book. On Wed, Jun 28, 2023 at 2:04 PM KenUnix wrote: > Guys, > > It's been too long. What would I use to compile this man page source?
> > I do remember some option switches are required. Yes? > > Thanks > > > On Wed, Jun 28, 2023 at 1:49 PM Adam Sampson wrote: > >> On Wed, Jun 28, 2023 at 09:26:02AM +0300, Aharon Robbins wrote: >> > Attached is "A Supplemental Document For Awk". This circulated on >> > USENET in the 80s. My copy is dated January 18, 1989, but I'm sure >> > it's older than that. >> >> In the utzoo Usenet archive, there are two versions of this document and >> a few mentions of it... >> >> John Pierce posted to comp.unix.questions on 1989-04-02, saying he'd >> written it "four or five years ago". >> >> Stu Heiss, in comp.unix.questions on 1989-03-06, said it was "posted to >> net.sources 18 Jun 86 with message-id 238 at sdchema.sdchem.uucp". >> Unfortunately this isn't in the utzoo archive or the net.sources.mbox >> in archive.org's Usenet Historical Collection. >> >> A copy identical to yours was posted by Jim Harkins to >> comp.unix.questions on 1990-03-29. >> >> There's a later version, fixing a typo and some formatting and adding a >> mention of \f and \b in printf, which was posted by Brian Kantor to >> comp.doc on 1987-10-11 -- I've attached this. The same file (with two >> .bps commented out) was reposted in comp.unix.questions on 1989-11-16 by >> Francois-Michel Lang. >> >> Thanks, >> >> -- >> Adam Sampson >> > > > -- > End of line > JOB TERMINATED -->> Okey Dokey, OK Boss > > > From grog at lemis.com Thu Jun 29 09:47:50 2023 From: grog at lemis.com (Greg 'groggy' Lehey) Date: Thu, 29 Jun 2023 09:47:50 +1000 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: <20230628234750.GE43966@eureka.lemis.com> On Wednesday, 28 June 2023 at 14:38:40 -0400, Clem Cole wrote: > On Wed, Jun 28, 2023 at 2:04 PM KenUnix wrote: > >> It's been too long. What would I use to compile this man page source? >> >> I do remember some option switches are required. Yes?
> > Download the file and make sure you save it in "UNIX" format, not DOS ( > *i.e.* newline delimited not the nasty cruft) -- (if you are not > sure how to do that running the dos2unix(1) command will assure it's was > not unix format when you are done). > > % file awkdoc > awkdoc: troff or preprocessor input text, ASCII text > % man groff There's also grog (groff guess) that may help. It's not very clever, but it recognizes a number of formats: $ grog ls.1 groff -mdoc ls.1 Greg -- Sent from my desktop computer. Finger grog at lemis.com for PGP public key. See complete headers for address and phone numbers. This message is digitally signed. If your Microsoft mail program reports problems, please read http://lemis.com/broken-MUA.php From reed at reedmedia.net Thu Jun 29 10:26:04 2023 From: reed at reedmedia.net (Jeremy C.
Reed) Date: Thu, 29 Jun 2023 00:26:04 +0000 (UTC) Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: I found a copy from 1986 in usenix89/Lang/Awk_doc/:STUFF (the file is called :STUFF) from a tar usenix878889.tar.gz I didn't check but I assume it is one here https://www.tuhs.org/Archive/Applications/Shoppa_Tapes/ Path: plus5!wuphys!wucs!we53!ltuxa!cuae2!ihnp4!mhuxn!mhuxr!ulysses!ucbvax!sdcsvax!sdchem!jwp From: jwp at sdchem.UUCP (John Pierce) Newsgroups: net.sources Subject: Awk document Message-ID: <238 at sdchema.sdchem.UUCP> Date: 18 Jun 86 20:04:32 GMT Reply-To: jwp at sdchem.UUCP (John Pierce) Organization: Chemistry Dept, UC San Diego Lines: 743 Posted: Wed Jun 18 15:04:32 1986 From bakul at iitbombay.org Thu Jun 29 11:04:13 2023 From: bakul at iitbombay.org (Bakul Shah) Date: Wed, 28 Jun 2023 18:04:13 -0700 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: Message-ID: The presence of .AB, .AU etc says you need nroff -ms But why even bother unless you plan to become an awkspert? > On Jun 28, 2023, at 11:03 AM, KenUnix wrote: > > Guys, > > It's been too long. What would I use to compile this man page source? > > I do remember some option switches are required. Yes? > > Thanks > > > On Wed, Jun 28, 2023 at 1:49 PM Adam Sampson wrote: > On Wed, Jun 28, 2023 at 09:26:02AM +0300, Aharon Robbins wrote: >> Attached is "A Supplemental Document For Awk". This circulated on >> USENET in the 80s. My copy is dated January 18, 1989, but I'm sure >> it's older than that. > > In the utzoo Usenet archive, there are two versions of this document and > a few mentions of it... > > John Pierce posted to comp.unix.questions on 1989-04-02, saying he'd > written it "four or five years ago". > > Stu Heiss, in comp.unix.questions on 1989-03-06, said it was "posted to > net.sources 18 Jun 86 with message-id 238 at sdchema.sdchem.uucp". 
> Unfortunately this isn't in the utzoo archive or the net.sources.mbox > in archive.org's Usenet Historical Collection. > > A copy identical to yours was posted by Jim Harkins to > comp.unix.questions on 1990-03-29. > > There's a later version, fixing a typo and some formatting and adding a > mention of \f and \b in printf, which was posted by Brian Kantor to > comp.doc on 1987-10-11 -- I've attached this. The same file (with two > .bps commented out) was reposted in comp.unix.questions on 1989-11-16 by > Francois-Michel Lang. > > Thanks, > > -- > Adam Sampson > > > -- > End of line > JOB TERMINATED -->> Okey Dokey, OK Boss From stuff at riddermarkfarm.ca Thu Jun 29 11:59:24 2023 From: stuff at riddermarkfarm.ca (Stuff Received) Date: Wed, 28 Jun 2023 21:59:24 -0400 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: <20230628234750.GE43966@eureka.lemis.com> References: <20230628234750.GE43966@eureka.lemis.com> Message-ID: <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> On 2023-06-28 19:47, Greg 'groggy' Lehey wrote: > On Wednesday, 28 June 2023 at 14:38:40 -0400, Clem Cole wrote: [...] > > There's also grog (groff guess) that may help. It's not very clever, > but it recognizes a number of formats: > > $ grog ls.1 > groff -mdoc ls.1 Thank you -- I never knew of its existence. But what did people use before grog and why was the compilation line never placed in a comment in the file? N. > > Greg From tuhs at tuhs.org Thu Jun 29 16:27:44 2023 From: tuhs at tuhs.org (segaloco via TUHS) Date: Thu, 29 Jun 2023 06:27:44 +0000 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID: > But what did people use before grog and why was the compilation line > never placed in a comment in the file? 
The primary macro packages I see come up between Bell and UCB are man, ms, mm, and me. Man of course finds use in the manual pages (although there are different representations of manpages in nroff over time.) From what I've seen (someone who was there can surely correct me) it seems that ms macros were more commonly used on the research side of things while the mm macros proliferated more in the supported side. Finally the me macros were a BSD component. Given these separations, the origin of or relative vicinity from which a paper originates provides much context as to which macros may be present. To a finer point, the papers published with V7 are ms macros papers while the new additions in PWB lineages are mm macros, while some papers that crop up in BSD likely use me (although I haven't gotten too far into BSD with doc research yet.) Papers from UNIX consumers such as universities are likely in ms or me most of the time. On the flip side, mm was the macro package touted with Documenter's Workbench, so many commercial operations using System V for documentation would've produced documents in mm. I'd be curious whether the earlier "Phototypesetter" package included ms or mm (or both.) I don't think I've seen a "papers" set with both the Lesk ms document and the Smith and Mashey mm one, so couldn't say how common both in the same Bell offering were. Additionally, my research hasn't touched on any officially sanctioned use of mm in BSD, so that's an area ripe for some more study. As for other breadcrumbs, Bell mm macros papers do often include a comment at the top indicating to print with nroff -mm or mm(1). I don't recall seeing similar in research papers, but haven't necessarily gone looking. In any case, the paper sets with UNIX itself typically had scripts included with the necessary command-lines, as many papers additionally needed some eqn and/or tbl processing. 
I imagine any other such formally distributed document sources would likewise include scripts in lieu of commentary, but it depends. - Matt G. From andrew at humeweb.com Thu Jun 29 16:41:53 2023 From: andrew at humeweb.com (Andrew Hume) Date: Wed, 28 Jun 2023 23:41:53 -0700 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID: over time, folks in research tended to use make (or its descendants) to generate paper outputs. altho i do recall a tool similar to grog that correctly orchestrated the ideal/pic/eqn/tbl/troff pipeline needed to generate the output. the order was important. as for macros, for several years we tended to use the pm macros (akin to the ms macros) because they drove chris van wyck and kernighan’s page balancing backend, which was necessary to produce print ready copy for journals etc. > On Jun 28, 2023, at 11:27 PM, segaloco via TUHS wrote: > >> But what did people use before grog and why was the compilation line >> never placed in a comment in the file? From noel.hunt at gmail.com Thu Jun 29 16:44:26 2023 From: noel.hunt at gmail.com (Noel Hunt) Date: Thu, 29 Jun 2023 16:44:26 +1000 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID: And let us not forget the wonderful 'mv' macros, for typesetting over-head projection slides. 
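The remarks above about the preprocessor pipeline and its order can be illustrated with a small shell sketch. It is not doctype or grog: roff_pipeline is a hypothetical helper, and the order it assumes (pic, then tbl, then eqn, then the formatter) is the common convention rather than something spelled out in this thread. It only prints a command line and runs nothing.

```shell
#!/bin/sh
# Hypothetical helper, not part of any roff distribution.
# Usage: roff_pipeline MACROS FILE
# Prints (does not run) a pipeline for FILE, adding a preprocessor only
# when its opening macro (.PS, .TS, .EQ) appears at the start of a line.
roff_pipeline() {
    macros=$1
    file=$2
    cmd="cat $file"
    # Assumed order: pic, then tbl, then eqn, then the formatter.
    grep -q '^\.PS' "$file" && cmd="$cmd | pic"
    grep -q '^\.TS' "$file" && cmd="$cmd | tbl"
    grep -q '^\.EQ' "$file" && cmd="$cmd | eqn"
    printf '%s | troff -m%s\n' "$cmd" "$macros"
}
```

For a file paper.ms that contains .TS and .EQ lines but no .PS, "roff_pipeline s paper.ms" prints "cat paper.ms | tbl | eqn | troff -ms". Detecting a preprocessor by its opening macro alone is a simplification, and file names containing spaces are not quoted in the output.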
From noel.hunt at gmail.com Thu Jun 29 16:45:42 2023 From: noel.hunt at gmail.com (Noel Hunt) Date: Thu, 29 Jun 2023 16:45:42 +1000 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID:

> altho i do recall a tool similar to grog that correctly orchestrated the ideal/pic/eqn/tbl/troff pipeline

Perhaps you are referring to 'doctype'?

From andrew at humeweb.com Thu Jun 29 16:48:26 2023 From: andrew at humeweb.com (Andrew Hume) Date: Wed, 28 Jun 2023 23:48:26 -0700 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID: <9293CB0A-37CF-4966-9C93-81CFD5BC7EAB@humeweb.com>

it's possible; i simply can’t remember 40 years ago.

> On Jun 28, 2023, at 11:45 PM, Noel Hunt wrote:
>
>> altho i do recall a tool similar to grog that correctly orchestrated the ideal/pic/eqn/tbl/troff pipeline
>
> Perhaps you are referring to 'doctype'?

From arnold at skeeve.com Thu Jun 29 16:50:34 2023 From: arnold at skeeve.com (arnold at skeeve.com) Date: Thu, 29 Jun 2023 00:50:34 -0600 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: <9293CB0A-37CF-4966-9C93-81CFD5BC7EAB@humeweb.com> References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> <9293CB0A-37CF-4966-9C93-81CFD5BC7EAB@humeweb.com> Message-ID: <202306290650.35T6oYmd009012@freefriends.org>

It is doctype. It's still alive (as an rc/grep/awk script) in Plan 9 and descendants.

Andrew Hume wrote:
> it's possible; i simply can’t remember 40 years ago.
>
> > On Jun 28, 2023, at 11:45 PM, Noel Hunt wrote:
> >
> >> altho i do recall a tool similar to grog that correctly orchestrated the ideal/pic/eqn/tbl/troff pipeline
> >
> > Perhaps you are referring to 'doctype'?
> From arnold at skeeve.com Thu Jun 29 17:14:04 2023 From: arnold at skeeve.com (arnold at skeeve.com) Date: Thu, 29 Jun 2023 01:14:04 -0600 Subject: [TUHS] Bell Labs CSTRs Message-ID: <202306290714.35T7E4Qv016653@freefriends.org> Available at https://www.skeeve.com/bell-labs-cstrs.tar.gz Warren and Brantley and anyone else, feel free to retrieve. I have two sets - both are in the tarball so there are undoubtedly duplications. If someone else can curate them into single canonical set that'd be helpful, I just don't have the time right now. Enjoy, Arnold From noel.hunt at gmail.com Thu Jun 29 17:36:25 2023 From: noel.hunt at gmail.com (Noel Hunt) Date: Thu, 29 Jun 2023 17:36:25 +1000 Subject: [TUHS] Bell Labs CSTRs In-Reply-To: <202306290714.35T7E4Qv016653@freefriends.org> References: <202306290714.35T7E4Qv016653@freefriends.org> Message-ID: Many thanks. On Thu, 29 Jun 2023 at 17:14, wrote: > > Available at https://www.skeeve.com/bell-labs-cstrs.tar.gz > > Warren and Brantley and anyone else, feel free to retrieve. > > I have two sets - both are in the tarball so there are undoubtedly > duplications. If someone else can curate them into single canonical > set that'd be helpful, I just don't have the time right now. > > Enjoy, > > Arnold From g.branden.robinson at gmail.com Thu Jun 29 23:34:00 2023 From: g.branden.robinson at gmail.com (G. Branden Robinson) Date: Thu, 29 Jun 2023 08:34:00 -0500 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: <20230628234750.GE43966@eureka.lemis.com> References: <20230628234750.GE43966@eureka.lemis.com> Message-ID: <20230629133400.d53treoxwyrxhnzi@illithid> At 2023-06-29T09:47:50+1000, Greg 'groggy' Lehey wrote: > There's also grog (groff guess) that may help. It's not very clever, > but it recognizes a number of formats: > > $ grog ls.1 > groff -mdoc ls.1 I won't claim that grog is more clever now, but as of groff 1.23.0 it is[1] avowedly less buggy. 
It is also 52% of its former size (by `wc -l`), has 14 bug fixes since groff 1.22.4 (with only a wish list item remaining), sports an automated test suite, and the tool itself can now be conveniently passed around as a single file--so I'm attaching it.

Regards,
Branden

[1] Will be. We're up to release candidate 4 now. https://alpha.gnu.org/gnu/groff/

-------------- next part --------------
#!/usr/bin/perl
# grog - guess options for groff command
# Inspired by doctype script in Kernighan & Pike, Unix Programming
# Environment, pp 306-8.

# Copyright (C) 1993-2021 Free Software Foundation, Inc.
# Written by James Clark.
# Rewritten in Perl by Bernd Warken.
# Hacked up by G. Branden Robinson, 2021.

# This file is part of 'grog', which is part of 'groff'.

# 'groff' is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.

# 'groff' is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program. If not, see
# .

use warnings;
use strict;

use File::Spec;

my $groff_version = 'DEVELOPMENT';

my @command = ();               # the constructed groff command
my @requested_package = ();     # arguments to '-m' grog options
my @inferred_preprocessor = (); # preprocessors the document uses
my @inferred_main_package = (); # full-service package(s) detected
my $main_package;               # full-service package we go with
my $do_run = 0;                 # run generated 'groff' command
my $use_compatibility_mode = 0; # is -C being passed to groff?

my %preprocessor_for_macro = (
  'EQ', 'eqn',
  'G1', 'grap',
  'GS', 'grn',
  'PS', 'pic',
  '[', 'refer',
  #'so', 'soelim', # Can't be inferred this way; see grog man page.
  'TS', 'tbl',
  'cstart', 'chem',
  'lilypond', 'glilypond',
  'Perl', 'gperl',
  'pinyin', 'gpinyin',
);

my $program_name = $0;
{
  my ($v, $d, $f) = File::Spec->splitpath($program_name);
  $program_name = $f;
}

my %user_macro;
my %score = ();

my @input_file;

# .TH is both a man(7) macro and often used with tbl(1). We expect to
# find .TH in ms(7) documents only between .TS and .TE calls, and in
# man(7) documents only as the first macro call.
my $have_seen_first_macro_call = 0;
# man(7) and ms(7) use many of the same macro names; do extra checking.
my $man_score = 0;
my $ms_score = 0;

my $had_inference_problem = 0;
my $had_processing_problem = 0;
my $have_any_valid_arguments = 0;


sub fail {
  my $text = shift;
  print STDERR "$program_name: error: $text\n";
  $had_processing_problem = 1;
}


sub warn {
  my $text = shift;
  print STDERR "$program_name: warning: $text\n";
}


sub process_arguments {
  my $no_more_options = 0;
  my $delayed_option = '';
  my $was_minus = 0;
  my $optarg = 0;
  my $pdf_with_ligatures = 0;

  foreach my $arg (@ARGV) {
    if ( $optarg ) {
      push @command, $arg;
      $optarg = 0;
      next;
    }

    if ($no_more_options) {
      push @input_file, $arg;
      next;
    }

    if ($delayed_option) {
      if ($delayed_option eq '-m') {
        push @requested_package, $arg;
        $arg = '';
      } else {
        push @command, $delayed_option;
      }

      push @command, $arg if $arg;
      $delayed_option = '';
      next;
    }

    unless ( $arg =~ /^-/ ) { # file name, no opt, no optarg
      push @input_file, $arg;
      next;
    }

    # now $arg starts with '-'

    if ($arg eq '-') {
      unless ($was_minus) {
        push @input_file, $arg;
        $was_minus = 1;
      }
      next;
    }

    if ($arg eq '--') {
      $no_more_options = 1;
      next;
    }

    # Handle options that cause an early exit.
    &version() if ($arg eq '-v' || $arg eq '--version');
    &usage(0) if ($arg eq '-h' || $arg eq '--help');

    if ($arg =~ '^--.') {
      if ($arg =~ '^--(run|with-ligatures)$') {
        $do_run = 1 if ($arg eq '--run');
        $pdf_with_ligatures = 1 if ($arg eq '--with-ligatures');
      } else {
        &fail("unrecognized grog option '$arg'; ignored");
        &usage(1);
      }
      next;
    }

    # Handle groff options that take an argument.

    # Handle the option argument being separated by whitespace.
    if ($arg =~ /^-[dfFIKLmMnoPrTwW]$/) {
      $delayed_option = $arg;
      next;
    }

    # Handle '-m' option without subsequent whitespace.
    if ($arg =~ /^-m/) {
      my $package = $arg;
      $package =~ s/-m//;
      push @requested_package, $package;
      next;
    }

    # Treat anything else as (possibly clustered) groff options that
    # take no arguments.

    # Our do_line() needs to know if it should do compatibility parsing.
    $use_compatibility_mode = 1 if ($arg =~ /C/);

    push @command, $arg;
  }

  if ($pdf_with_ligatures) {
    push @command, '-P-y';
    push @command, '-PU';
  }

  @input_file = ('-') unless (@input_file);
} # process_arguments()


sub process_input {
  foreach my $file (@input_file) {
    unless ( open(FILE, $file eq "-" ? $file : "< $file") ) {
      &fail("cannot open '$file': $!");
      next;
    }

    $have_any_valid_arguments = 1;

    while (my $line = <FILE>) {
      chomp $line;
      &do_line($line);
    }

    close(FILE);
  } # end foreach
} # process_input()


# Push item onto inferred full-service list only if not already present.
sub push_main_package {
  my $pkg = shift;
  if (!grep(/^$pkg/, @inferred_main_package)) {
    push @inferred_main_package, $pkg;
  }
} # push_main_package()


sub do_line {
  my $command; # request or macro name
  my $args;    # request or macro arguments

  my $line = shift;

  # Check for a Perl Pod::Man comment.
  #
  # An alternative to this kludge is noted below: if a "standard" macro
  # is redefined, we could delete it from the relevant lists and
  # hashes.
  if ($line =~ /\\\" Automatically generated by Pod::Man/) {
    $man_score += 100;
  }

  # Strip comments.
  $line =~ s/\\".*//;
  $line =~ s/\\#.*// unless $use_compatibility_mode;

  return unless ($line =~ /^[.']/); # Ignore text lines.

  # Perform preprocessor checks; they scan their inputs using a rump
  # interpretation of roff(7) syntax that requires the default control
  # character and no space between it and the macro name. In AT&T
  # compatibility mode, no space (or newline!) is required after the
  # macro name, either. We mimic the preprocessors themselves; eqn(1),
  # for instance, does not recognize '.EN' if '.EQ' has not been seen.
  my $boundary = '\\b';
  $boundary = '' if ($use_compatibility_mode);

  if ($line =~ /^\.(\S\S)$boundary/ || $line =~ /^\.(\[)/) {
    my $macro = $1;
    # groff identifiers can have extremely weird characters in them.
    # The ones we care about are conventionally named, but me(7)
    # documents can call macros like '+c', so quote carefully.
    if (grep(/^\Q$macro\E$/, keys %preprocessor_for_macro)) {
      my $preproc = $preprocessor_for_macro{$macro};
      if (!grep(/$preproc/, @inferred_preprocessor)) {
        push @inferred_preprocessor, $preproc;
      }
    }
  }

  # Normalize control lines; convert no-break control character to the
  # regular one and remove unnecessary whitespace.
  $line =~ s/^['.]\s*/./;
  $line =~ s/\s+$//;

  return if ($line =~ /^\.$/);      # Ignore empty request.
  return if ($line =~ /^\.\\?\.$/); # Ignore macro definition ends.

  # Split control line into a request or macro call and its arguments.

  # Handle single-letter macro names.
  if ($line =~ /^\.(\S)(\s+(.*))?$/) {
    $command = $1;
    $args = $2;
  # Handle two-letter macro/request names in compatibility mode.
  } elsif ($use_compatibility_mode) {
    $line =~ /^\.(\S\S)\s*(.*)$/;
    $command = $1;
    $args = $2;
  # Handle multi-letter macro/request names in groff mode.
  } else {
    $line =~ /^\.(\S+)(\s+(.*))?$/;
    $command = $1;
    $args = $3;
  }

  $command = '' unless ($command);
  $args = '' unless ($args);

  ######################################################################
  # user-defined macros

  # If the line calls a user-defined macro, skip it.
  return if (exists $user_macro{$command});

  # These are all requests supported by groff 1.23.0.
  my @request = ('ab', 'ad', 'af', 'aln', 'als', 'am', 'am1', 'ami',
                 'ami1', 'as', 'as1', 'asciify', 'backtrace', 'bd',
                 'blm', 'box', 'boxa', 'bp', 'br', 'brp', 'break', 'c2',
                 'cc', 'ce', 'cf', 'cflags', 'ch', 'char', 'chop', 'class',
                 'close', 'color', 'composite', 'continue', 'cp', 'cs',
                 'cu', 'da', 'de', 'de1', 'defcolor', 'dei', 'dei1',
                 'device', 'devicem', 'di', 'do', 'ds', 'ds1', 'dt', 'ec',
                 'ecr', 'ecs', 'el', 'em', 'eo', 'ev', 'evc', 'ex', 'fam',
                 'fc', 'fchar', 'fcolor', 'fi', 'fp', 'fschar', 'fspecial',
                 'ft', 'ftr', 'fzoom', 'gcolor', 'hc', 'hcode', 'hla',
                 'hlm', 'hpf', 'hpfa', 'hpfcode', 'hw', 'hy', 'hym', 'hys',
                 'ie', 'if', 'ig', 'in', 'it', 'itc', 'kern', 'lc',
                 'length', 'linetabs', 'lf', 'lg', 'll', 'lsm', 'ls', 'lt',
                 'mc', 'mk', 'mso', 'msoquiet', 'na', 'ne', 'nf', 'nh',
                 'nm', 'nn', 'nop', 'nr', 'nroff', 'ns', 'nx', 'open',
                 'opena', 'os', 'output', 'pc', 'pev', 'pi', 'pl', 'pm',
                 'pn', 'pnr', 'po', 'ps', 'psbb', 'pso', 'ptr', 'pvs',
                 'rchar', 'rd', 'return', 'rfschar', 'rj', 'rm', 'rn',
                 'rnn', 'rr', 'rs', 'rt', 'schar', 'shc', 'shift', 'sizes',
                 'so', 'soquiet', 'sp', 'special', 'spreadwarn', 'ss',
                 'stringdown', 'stringup', 'sty', 'substring', 'sv', 'sy',
                 'ta', 'tc', 'ti', 'tkf', 'tl', 'tm', 'tm1', 'tmc', 'tr',
                 'trf', 'trin', 'trnt', 'troff', 'uf', 'ul', 'unformat',
                 'vpt', 'vs', 'warn', 'warnscale', 'wh', 'while', 'write',
                 'writec', 'writem');

  # Add user-defined macro names to %user_macro.
  #
  # Macros can also be defined with .dei{,1}, ami{,1}, but supporting
  # that would be a heavy lift for the benefit of users that probably
  # don't require grog's help. --GBR
  if ($command =~ /^(de|am)1?$/) {
    my $name = $args;
    # Strip off any end macro.
    $name =~ s/\s+.*$//;
    # Handle special cases of macros starting with '[' or ']'.
    if ($name =~ /^[][]/) {
      delete $preprocessor_for_macro{'['};
    }
    # XXX: If the macro name shadows a standard macro name, maybe we
    # should delete the latter from our lists and hashes. This might
    # depend on whether the document is trying to remain compatible
    # with an existing interface, or simply colliding with names they
    # don't care about (consider a raw roff document that defines 'PP').
    # --GBR
    $user_macro{$name} = 0 unless (exists $user_macro{$name});
    return;
  }

  # XXX: Handle .rm as well?

  # Ignore all other requests. Again, macro names can contain Perl
  # regex metacharacters, so be careful.
  return if (grep(/^\Q$command\E$/, @request));
  # What remains must be a macro name.
  my $macro = $command;

  $score{$macro}++;

  ######################################################################
  # macro package (tmac)
  ######################################################################

  # man and ms share too many macro names for the following approach to
  # be fruitful for many documents; see &infer_man_or_ms_package.
  #
  # We can put one thumb on the scale, however.
  if ((!$have_seen_first_macro_call) && ($macro eq 'TH')) {
    # TH as the first call in a document screams man(7).
    $man_score += 100;
  }
  $have_seen_first_macro_call = 1;

  ##########
  # mdoc
  if ($macro =~ /^Dd$/) {
    &push_main_package('doc');
    return;
  }

  ##########
  # old mdoc
  if ($macro =~ /^(Tp|Dp|De|Cx|Cl)$/) {
    &push_main_package('doc-old');
    return;
  }

  ##########
  # me
  if ($macro =~ /^(
                    [ilnp]p|
                    n[12]|
                    sh
                  )$/x) {
    &push_main_package('e');
    return;
  }

  #############
  # mm and mmse
  if ($macro =~ /^(
                    H|
                    MULB|
                    LO|
                    LT|
                    NCOL|
                    PH|
                    SA
                  )$/x) {
    if ($macro =~ /^LO$/) {
      if ( $args =~ /^(DNAMN|MDAT|BIL|KOMP|DBET|BET|SIDOR)/ ) {
        &push_main_package('mse');
        return;
      }
    } elsif ($macro =~ /^LT$/) {
      if ( $args =~ /^(SVV|SVH)/ ) {
        &push_main_package('mse');
        return;
      }
    }
    &push_main_package('m');
    return;
  }

  ##########
  # mom
  if ($macro =~ /^(
                    ALD|
                    AUTHOR|
                    CHAPTER_TITLE|
                    CHAPTER|
                    COLLATE|
                    DOCHEADER|
                    DOCTITLE|
                    DOCTYPE|
                    DOC_COVER|
                    FAMILY|
                    FAM|
                    FT|
                    LEFT|
                    LL|
                    LS|
                    NEWPAGE|
                    NO_TOC_ENTRY|
                    PAGENUMBER|
                    PAGE|
                    PAGINATION|
                    PAPER|
                    PRINTSTYLE|
                    PT_SIZE|
                    START|
                    TITLE|
                    TOC_AFTER_HERE|
                    TOC|
                    T_MARGIN
                  )$/x) {
    &push_main_package('om');
    return;
  }
} # do_line()

my @preprocessor = ();


sub infer_preprocessors {
  my %option_for_preprocessor = (
    'eqn', '-e',
    'grap', '-G',
    'grn', '-g',
    'pic', '-p',
    'refer', '-R',
    #'soelim', '-s', # Can't be inferred this way; see grog man page.
    'tbl', '-t',
    'chem', '-j'
  );

  # Use a temporary list we can sort later. We want the options to show
  # up in a stable order for testing purposes instead of the order their
  # macros turn up in the input. groff doesn't care about the order.
  my @opt = ();

  foreach my $preproc (@inferred_preprocessor) {
    my $preproc_option = $option_for_preprocessor{$preproc};

    if ($preproc_option) {
      push @opt, $preproc_option;
    } else {
      push @preprocessor, $preproc;
    }
  }
  push @command, sort @opt;
} # infer_preprocessors()


# Return true (1) if either the man or ms package is inferred.
sub infer_man_or_ms_package {
  my @macro_ms = ('RP', 'TL', 'AU', 'AI', 'DA', 'ND', 'AB', 'AE',
                  'QP', 'QS', 'QE', 'XP', 'NH', 'R', 'CW', 'BX',
                  'UL', 'LG', 'NL', 'KS', 'KF', 'KE', 'B1', 'B2',
                  'DS', 'DE', 'LD', 'ID', 'BD', 'CD', 'RD', 'FS',
                  'FE', 'OH', 'OF', 'EH', 'EF', 'P1', 'TA', '1C',
                  '2C', 'MC', 'XS', 'XE', 'XA', 'TC', 'PX', 'IX',
                  'SG');
  my @macro_man = ('BR', 'IB', 'IR', 'RB', 'RI', 'P', 'TH', 'TP',
                   'SS', 'HP', 'PD', 'AT', 'UC', 'SB', 'EE', 'EX',
                   'OP', 'MT', 'ME', 'SY', 'YS', 'TQ', 'UR', 'UE');
  my @macro_man_or_ms = ('B', 'I', 'BI', 'DT', 'RS', 'RE', 'SH', 'SM',
                         'IP', 'LP', 'PP');

  for my $key (@macro_man_or_ms, @macro_man, @macro_ms) {
    $score{$key} = 0 unless exists $score{$key};
  }

  # Compute a score for each package by counting occurrences of their
  # characteristic macros.
  foreach my $key (@macro_man_or_ms) {
    $man_score += $score{$key};
    $ms_score += $score{$key};
  }

  foreach my $key (@macro_man) {
    $man_score += $score{$key};
  }

  foreach my $key (@macro_ms) {
    $ms_score += $score{$key};
  }

  if (!$ms_score && !$man_score) {
    # The input may be a "raw" roff document; this is not a problem,
    # but it does mean no package was inferred.
    return 0;
  } elsif ($ms_score == $man_score) {
    # If there was no TH call, it's not a (valid) man(7) document.
    if (!$score{'TH'}) {
      &push_main_package('s');
    } else {
      &warn("document ambiguous; disambiguate with -man or -ms option");
      $had_inference_problem = 1;
    }
    return 0;
  } elsif ($ms_score > $man_score) {
    &push_main_package('s');
  } else {
    &push_main_package('an');
  }

  return 1;
} # infer_man_or_ms_package()


sub construct_command {
  my @main_package = ('an', 'doc', 'doc-old', 'e', 'm', 'om', 's');
  my $file_args_included; # file args now only at 1st preproc
  unshift @command, 'groff';
  if (@preprocessor) {
    my @progs;
    $progs[0] = shift @preprocessor;
    push(@progs, @input_file);
    for (@preprocessor) {
      push @progs, '|';
      push @progs, $_;
    }
    push @progs, '|';
    unshift @command, @progs;
    $file_args_included = 1;
  } else {
    $file_args_included = 0;
  }

  foreach (@command) {
    next unless /\s/;
    # when one argument has several words, use accents
    $_ = "'" . $_ . "'";
  }

  my $have_ambiguous_main_package = 0;
  my $inferred_main_package_count = scalar @inferred_main_package;

  # Did we infer multiple full-service packages?
  if ($inferred_main_package_count > 1) {
    $have_ambiguous_main_package = 1;
    # For each one the user explicitly requested...
    for my $pkg (@requested_package) {
      # ...did it resolve the ambiguity for us?
      if (grep(/$pkg/, @inferred_main_package)) {
        @inferred_main_package = ($pkg);
        $have_ambiguous_main_package = 0;
        last;
      }
    }
  } elsif ($inferred_main_package_count == 1) {
    $main_package = shift @inferred_main_package;
  }

  if ($have_ambiguous_main_package) {
    # TODO: Alphabetical is probably not the best ordering here. We
    # should tally up scores on a per-package basis generally, not just
    # for an and s.
    for my $pkg (@main_package) {
      if (grep(/$pkg/, @inferred_main_package)) {
        $main_package = $pkg;
        &warn("document ambiguous (choosing '$main_package'"
              . " from '@inferred_main_package'); disambiguate with -m"
              . " option");
        $had_inference_problem = 1;
        last;
      }
    }
  }

  # If a full-service package was explicitly requested, warn if the
  # inference differs from the request. This also ensures that all -m
  # arguments are placed in the same order that the user gave them;
  # caveat dictator.
  my @auxiliary_package_argument = ();
  for my $pkg (@requested_package) {
    my $is_auxiliary_package = 1;
    if (grep(/$pkg/, @main_package)) {
      $is_auxiliary_package = 0;
      if ($pkg ne $main_package) {
        &warn("overriding inferred package '$main_package'"
              . " with requested package '$pkg'");
        $main_package = $pkg;
      }
    }
    if ($is_auxiliary_package) {
      push @auxiliary_package_argument, "-m" . $pkg;
    }
  }

  push @command, '-m' . $main_package if ($main_package);
  push @command, @auxiliary_package_argument;
  push @command, @input_file unless ($file_args_included);

  #########
  # execute the 'groff' command here with option '--run'
  if ( $do_run ) { # with --run
    print STDERR "@command\n";
    my $cmd = join ' ', @command;
    system($cmd);
  } else {
    print "@command\n";
  }
} # construct_command()


sub usage {
  my $stream = *STDOUT;
  my $had_error = shift;
  $stream = *STDERR if $had_error;
  my $grog = $program_name;
  print $stream "usage: $grog [--ligatures] [--run]" .
    " [groff-option ...] [--] [file ...]\n" .
    "usage: $grog {-v | --version}\n" .
    "usage: $grog {-h | --help}\n";
  unless ($had_error) {
    print $stream "\n" .
      "Read each roff(7) input FILE and attempt to infer an appropriate\n" .
      "groff(1) command to format it. See the grog(1) manual page.\n";
  }
  exit $had_error;
}


sub version {
  print "GNU $program_name (groff) $groff_version\n";
  exit 0;
} # version()


# initialize

my $in_unbuilt_source_tree = 0;
{
  my $at = '@';
  $in_unbuilt_source_tree = 1
    if ('1.23.0.rc4.391-325a' eq "${at}VERSION${at}");
}

$groff_version = '1.23.0.rc4.391-325a' unless ($in_unbuilt_source_tree);

&process_arguments();
&process_input();

if ($have_any_valid_arguments) {
  &infer_preprocessors();
  &infer_man_or_ms_package() if (scalar @inferred_main_package != 1);
  &construct_command();
}

exit 2 if ($had_processing_problem);
exit 1 if ($had_inference_problem);
exit 0;

# Local Variables:
# fill-column: 72
# mode: CPerl
# End:
# vim: set cindent noexpandtab shiftwidth=2 softtabstop=2 textwidth=72:

From g.branden.robinson at gmail.com Thu Jun 29 23:45:50 2023 From: g.branden.robinson at gmail.com (G. Branden Robinson) Date: Thu, 29 Jun 2023 08:45:50 -0500 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID: <20230629134550.mpffmezmm3xj2hrp@illithid>

At 2023-06-28T21:59:24-0400, Stuff Received wrote:
> and why was the compilation line never placed in a comment in the
> file?

Having done some work with historical *roff documents, my conjecture is that the single source of truth was usually to be found in a Makefile. Unfortunately, *roff documents have not reliably been distributed along with the scripts directing control of their compilation and installation. If you insist upon that, you start sounding like one of those street-corner preaching copyleft people...
;-)

Regards,
Branden

From rich.salz at gmail.com Thu Jun 29 23:47:12 2023 From: rich.salz at gmail.com (Rich Salz) Date: Thu, 29 Jun 2023 09:47:12 -0400 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: <20230629133400.d53treoxwyrxhnzi@illithid> References: <20230628234750.GE43966@eureka.lemis.com> <20230629133400.d53treoxwyrxhnzi@illithid> Message-ID:

A perl script to intuit likely roff options is definitely a neat Unix hack.

From g.branden.robinson at gmail.com Fri Jun 30 00:02:23 2023 From: g.branden.robinson at gmail.com (G. Branden Robinson) Date: Thu, 29 Jun 2023 09:02:23 -0500 Subject: [TUHS] Trying to date "A Supplemental Document For Awk" In-Reply-To: References: <20230628234750.GE43966@eureka.lemis.com> <237fe42c-7803-33cc-68c3-c9a05966b95a@riddermarkfarm.ca> Message-ID: <20230629140223.gg4mgvsd7aigbj4e@illithid>

At 2023-06-29T06:27:44+0000, segaloco via TUHS wrote:
> Man of course finds use in the manual pages (although there are
> different representations of manpages in nroff over time.)

Setting aside the well known bifurcation between man(7) and mdoc(7), which manage to stay out of each other's way in the macro name space, I'm not aware of any comparative survey of different man(7) implementations. Ultrix at some point--I have no insight into the chronology of it--had a large set of extensions that remains quietly documented and supported by groff to this day, albeit off in a corner where it seems to receive little attention. (Just as well, in my opinion, as not all of its innovations are worthy of embrace.)
As far as other vendor extensions and developments go, I have collected all of the information known to me into the groff_man(7) page in the any-minute-now groff 1.23.0 release. Here are the relevant sections. (There are two because concept and implementation are distinguishable.) History M. Douglas McIlroy designed, implemented, and documented the AT&T man macros for Unix Version 7 (1979) and employed them to edit the first volume of its Programmer's Manual, a compilation of all man pages supplied by the system. That man supported the macros listed in this page not described as extensions, except .P and the deprecated .AT and .UC. The only strings defined were R and S; no registers were documented. .UC appeared in 3BSD (1980). Unix System III (1980) introduced .P and exposed the registers IN and LL, which had been internal to Seventh Edition Unix man. PWB/UNIX 2.0 (1980) added the Tm string. 4BSD (1980) added lq and rq strings. SunOS 2.0 (1985) recognized C, D, P, and X registers. 4.3BSD (1986) added .AT and .P. Ninth Edition Research Unix (1986) introduced .EX and .EE. SunOS 4.0 (1988) added .SB. The foregoing features were what James Clark implemented in early versions of groff. Later, groff 1.20 (2009) originated .SY/.YS, .TQ, .MT/.ME, and .UR/.UE. Plan 9 from User Space's troff introduced .MR in 2020. Authors The initial GNU implementation of the man macro package was written by James Clark. Later, Werner Lemberg supplied the S, LT, and cR registers, the last a 4.3BSD-Reno mdoc(7) feature. Larry Kollar added the FT, HY, and SN registers; the HF string; and the PT and BT macros. G. Branden Robinson implemented the AD and MF strings; CS, CT, and U registers; and the MR macro. Except for .SB, the extension macros were written by Lemberg, Eric S. Raymond, and Robinson. This document was originally written for the Debian GNU/Linux system by Susan G. Kleinmann. It was corrected and updated by Lemberg and Robinson. 
The extension macros were documented by Raymond and Robinson.

I welcome any further insights people can offer. This man page isn't the best place to document extensions that withered on the vine (like Eighth/Ninth Edition Research Unix's addition of multi-column macros for man(7)), but I wouldn't mind collecting such things into some sort of auxiliary article.

While the mandoc(1)/mdocml project's "History of UNIX Manpages"[1] is an invaluable resource, it doesn't really do what's written on the tin, and serves more as a history of (some) *roff _formatters_--not of the man(7) language. I assume that this stance is in part due to the unease bordering on antipathy that mandoc(1) proponents have for the man(7) macro package. In their view, everybody should be writing mdoc(7). Unfortunately this lacuna has left useful historical information about the man(7) package uncollected.

Regards,
Branden

[1] https://manpages.bsd.lv/history.html

From clemc at ccc.com Fri Jun 30 00:40:36 2023 From: clemc at ccc.com (Clem Cole) Date: Thu, 29 Jun 2023 10:40:36 -0400 Subject: [TUHS] Bell Labs CSTRs In-Reply-To: References: <202306290714.35T7E4Qv016653@freefriends.org> Message-ID:

+1 👍

On Thu, Jun 29, 2023 at 3:37 AM Noel Hunt wrote:
> Many thanks.
>
> On Thu, 29 Jun 2023 at 17:14, wrote:
> >
> > Available at https://www.skeeve.com/bell-labs-cstrs.tar.gz
> >
> > Warren and Brantley and anyone else, feel free to retrieve.
> >
> > I have two sets - both are in the tarball so there are undoubtedly
> > duplications. If someone else can curate them into single canonical
> > set that'd be helpful, I just don't have the time right now.
> >
> > Enjoy,
> >
> > Arnold
>
From will.senn at gmail.com Fri Jun 30 01:04:35 2023 From: will.senn at gmail.com (Will Senn) Date: Thu, 29 Jun 2023 10:04:35 -0500 Subject: [TUHS] Bell Labs CSTRs In-Reply-To: References: <202306290714.35T7E4Qv016653@freefriends.org> Message-ID:

Clem's +1 caught my attention, so I looked into the referenced docs. I saw the rather simple (conceptually) m6 processor described in tech note 54. I like that it's understandable. Why is it called m6? Just curious.

Will

On 6/29/23 09:40, Clem Cole wrote:
> +1 👍
>
> On Thu, Jun 29, 2023 at 3:37 AM Noel Hunt wrote:
>
> Many thanks.
>
> On Thu, 29 Jun 2023 at 17:14, wrote:
> >
> > Available at https://www.skeeve.com/bell-labs-cstrs.tar.gz
> >
> > Warren and Brantley and anyone else, feel free to retrieve.
> >
> > I have two sets - both are in the tarball so there are undoubtedly
> > duplications. If someone else can curate them into single canonical
> > set that'd be helpful, I just don't have the time right now.
> >
> > Enjoy,
> >
> > Arnold
>

From will.senn at gmail.com Fri Jun 30 01:14:30 2023 From: will.senn at gmail.com (Will Senn) Date: Thu, 29 Jun 2023 10:14:30 -0500 Subject: [TUHS] Bell Labs CSTRs In-Reply-To: References: <202306290714.35T7E4Qv016653@freefriends.org> Message-ID:

On a related note, I just read cstr 99 - Bell's computing research history and one of Doug's early articles was mentioned:

M. D. McIlroy, "Macro Instruction Extension of Compiler Languages," Communications of the ACM 3 (April 1960), pp. 214-220.

It's discussing the general extensibility that macros provide and I was interested to obtain a copy to read at leisure. I found it over on ACM's digital library:

https://dl.acm.org/doi/pdf/10.1145/367177.367223

But, the copy's not that great on my deteriorating eyesight. Does anybody have a cleaner copy?
Lately, I've been vastly improving my vi/vim skills and part of that
process is shifting from an ad-hoc process to a move, act, repeat
mentality (thank you Drew Neil for that revelation) and macros are
consonant with this line of thinking :).

Will

On 6/29/23 09:40, Clem Cole wrote:
> +1 👍
>
> On Thu, Jun 29, 2023 at 3:37 AM Noel Hunt wrote:
>
> > Many thanks.
> >
> > On Thu, 29 Jun 2023 at 17:14, wrote:
> > >
> > > Available at https://www.skeeve.com/bell-labs-cstrs.tar.gz
> > >
> > > Warren and Brantley and anyone else, feel free to retrieve.
> > >
> > > I have two sets - both are in the tarball so there are undoubtedly
> > > duplications. If someone else can curate them into single canonical
> > > set that'd be helpful, I just don't have the time right now.
> > >
> > > Enjoy,
> > >
> > > Arnold

From steve at quintile.net Fri Jun 30 02:02:02 2023
From: steve at quintile.net (Steve Simon)
Date: Thu, 29 Jun 2023 17:02:02 +0100
Subject: [TUHS] troff doc discovery
Message-ID: <9FB0FDE6-6EF1-42F9-A200-FDAA916C0525@quintile.net>

An HTML attachment was scrubbed...

From steffen at sdaoden.eu Fri Jun 30 05:03:12 2023
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Thu, 29 Jun 2023 21:03:12 +0200
Subject: [TUHS] Trying to date "A Supplemental Document For Awk"
In-Reply-To:
References: <20230628234750.GE43966@eureka.lemis.com> <20230629133400.d53treoxwyrxhnzi@illithid>
Message-ID: <20230629190312.OzHE3%steffen@sdaoden.eu>

Rich Salz wrote:
 |A perl script to intuit likely roff options is definitely a neat Unix hack.

The "problem" is that the "shebang" line used for UNIX man'uals on
at least a few ("newer" <> post Y2K) systems has never been
extended in plain *roff terms, for general macro things.
For example, newer man(1)s read the first line of the manual, check
for the syntax <^'\" > followed by a concatenation of [egprtv]+ (and
in fact *join in* the [egprtv]+ of the $MANROFFSEQ environment
variable), and then do

    while getopts 'egprtv' preproc_arg; do
        case "${preproc_arg}" in
        e) pipeline="$pipeline | $EQN" ;;
        g) GRAP ;; # Ignore for compatibility.
        p) pipeline="$pipeline | $PIC" ;;
        r) pipeline="$pipeline | $REFER" ;;
        t) pipeline="$pipeline | $TBL" ;;
        v) pipeline="$pipeline | $VGRIND" ;;
        *) usage ;;
        esac
    done

Of course, most roffs do not have that "super process" that groff
actually is, for one, so you have to formulate pipelines anyway.
And then roff is dead for the young. Generally speaking.

It is only a pity in my opinion because the most widely used
implementation (GNU roff) actually does "magic" already and
anyway, namely in its preconv(1), which does

    preconv tries to find the input encoding with the following
    algorithm.
    ...
    2. Otherwise, check whether the input starts with a Byte Order
       Mark (BOM, see below). If found, use it.
    3. Otherwise, check whether there is a known coding tag (see
       below) in either the first or second input line. If found,
       use it.
    ...
    5. If everything fails[.]

And 3. is then

    [.]supports the coding tag convention (with some restrictions)
    as used by GNU Emacs and XEmacs[.]
    ...
    .\" -*- mode: troff; coding: latin-2 -*-

But possibly the future brings not only integrative and truthful
western white men, but also a roff which "can". The former i
doubt, the latter i can still hope for.
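As a rough illustration of the first-line convention described above,
here is a minimal sketch in POSIX sh. It is not the code of any
particular man(1): the function names preproc_pipeline and append, and
the default command names, are assumptions made for the example, and
the $MANROFFSEQ handling is left out.

```sh
#!/bin/sh
# Sketch only: map the preprocessor letters on a man page's first line
# to a pipeline string. All names here are hypothetical.

# Default command names (assumptions; override through the environment).
EQN=${EQN:-eqn}
PIC=${PIC:-pic}
REFER=${REFER:-refer}
TBL=${TBL:-tbl}
VGRIND=${VGRIND:-vgrind}

# Append one command to $pipeline, inserting " | " between commands.
append() {
	pipeline="${pipeline:+$pipeline | }$1"
}

# Print the pipeline requested by the first line of a page,
# or an empty line if the line does not start with the '\" tag.
preproc_pipeline() {
	first_line=$1
	pipeline=
	case $first_line in
	"'\\\" "*) letters=${first_line#????} ;;	# drop the 4-character tag
	*) letters= ;;
	esac
	while [ -n "$letters" ]; do
		rest=${letters#?}
		c=${letters%"$rest"}	# first remaining character
		letters=$rest
		case $c in
		e) append "$EQN" ;;
		g) ;;			# grap: accepted but ignored
		p) append "$PIC" ;;
		r) append "$REFER" ;;
		t) append "$TBL" ;;
		v) append "$VGRIND" ;;
		*) ;;			# anything else is skipped in this sketch
		esac
	done
	printf '%s\n' "$pipeline"
}

# Example: a page whose first line is  '\" te
preproc_pipeline "'\\\" te"
```

With the default command names, the example call prints "tbl | eqn".
Unlike the getopts loop quoted above, which calls usage on an unknown
letter, this sketch silently skips anything it does not recognize.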
--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From steffen at sdaoden.eu Fri Jun 30 05:05:08 2023
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Thu, 29 Jun 2023 21:05:08 +0200
Subject: [TUHS] troff doc discovery
In-Reply-To: <9FB0FDE6-6EF1-42F9-A200-FDAA916C0525@quintile.net>
References: <9FB0FDE6-6EF1-42F9-A200-FDAA916C0525@quintile.net>
Message-ID: <20230629190508.sJ_9A%steffen@sdaoden.eu>

Steve Simon wrote in <9FB0FDE6-6EF1-42F9-A200-FDAA916C0525 at quintile.net>:
 |i always liked the approach of the plan9 script, bayesian in its approach \
 |(perhaps).
 |
 |it is also admirably short.

Looking at

  #?0|kent:9front.git$ git show origin/front:rc/bin/doctype

it seems to me a UNIX/POSIX sh(1) with getopt plus the same
grep|sort|awk pipe is not outperformed here.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From noel.hunt at gmail.com Fri Jun 30 08:15:32 2023
From: noel.hunt at gmail.com (Noel Hunt)
Date: Fri, 30 Jun 2023 08:15:32 +1000
Subject: [TUHS] Jerq menuhit/mhit
Message-ID:

The standard routine for drawing menus on the jerq was 'menuhit'.
Items in the menu were centered, and the menu was scrollable when a
certain threshold number of items was reached. In addition, when the
mouse pointer was in the top (bottom) item of the menu and it was
possible to scroll in the appropriate direction, the menu was scrolled
up or down 1 line. The structure associated with these menus is
'Menu'.

There was, however, another menu-drawing routine, 'mhit', which drew
hierarchical menus. Its structure, NMenu, no longer contained text
strings but an array of NItems.
NMenus also had provision for 'help' text: a simple string displayed
on the screen when button 1 was pressed while an entry in the menu was
selected.

The earliest version, in the Eighth Edition jerq code, also has one
function in the NMenu structure, which is called when the mouse
pointer invokes a hierarchical menu. By Ninth Edition this has been
expanded to 3 functions: one as above, one invoked when an item is
selected ('hit'), and one when a hierarchical menu is exited.

In the jerq code directories, under 'lib/jj', is a small 'ms'
document, 'A Library of Goo for the 5620', which lists routines
available in the library, and their authors. Andrew Hume is listed as
the author of 'mhit'.

Are there examples of code using these three menu functions ('dfn',
'hfn', 'bfn')?

There seems to have been little interest in hierarchical menus at the
labs; their use was quite limited. I found a program in the Tenth
Edition archive, 'bubble' (which seems to be a program for displaying
the three-dimensional structure of molecules), which uses them.
'samuel' made heavy use of them, including use of the 'hit' function,
and Tom Cargill used basically the same code in 'pads', wherein the
routine was called 'scripthit'.

The plain 'menuhit' survived into Plan 9, but as far as I know, it is
only used by 'sam'.

From andrew at humeweb.com Fri Jun 30 08:22:47 2023
From: andrew at humeweb.com (Andrew Hume)
Date: Thu, 29 Jun 2023 15:22:47 -0700
Subject: [TUHS] Jerq menuhit/mhit
In-Reply-To:
References:
Message-ID: <88993DC3-99D0-4977-A410-2BCAED781245@humeweb.com>

i remember mhit well.

generally, most folks thought the user interface had gone wrong if you
needed to handle such large lists in a menu. so there was a mild
cultural prejudice against such things.

however, i needed it for a couple of projects, including circuit
layout software. you can imagine selecting chips from such a menu and
so on.

> On Jun 29, 2023, at 3:15 PM, Noel Hunt wrote:
> ...
> The earliest version in the Eighth Edition jerq code, also has
> one function in the NMenu structure which is called when the
> mouse pointer invokes a hierarchical menu. By Ninth Edition
> this has been expanded, with 3 functions, one as above, one
> invoked when an item is selected ('hit') and one when a
> hierarchical menu is exited.
>
> In the jerq code directories, under 'lib/jj', is a small 'ms'
> document, 'A Library of Goo for the 5620', which lists
> routines available in the library, and their authors. Andrew
> Hume is listed as the author of 'mhit'.
>
> Are there examples of code using these three menu functions
> ('dfn', 'hfn', 'bfn')?

From arnold at skeeve.com Fri Jun 30 17:45:28 2023
From: arnold at skeeve.com (arnold at skeeve.com)
Date: Fri, 30 Jun 2023 01:45:28 -0600
Subject: [TUHS] troff doc discovery
In-Reply-To: <9FB0FDE6-6EF1-42F9-A200-FDAA916C0525@quintile.net>
References: <9FB0FDE6-6EF1-42F9-A200-FDAA916C0525@quintile.net>
Message-ID: <202306300745.35U7jSOF026761@freefriends.org>

Steve Simon wrote:

> i always liked the approach of the plan9 script, bayesian in its
> approach (perhaps). it is also admirably short.
>
> -Steve
>
> /usr/web/sources/plan9/rc/bin/doctype - Plan 9 from Bell Labs
> 9p.io

Hi Steve.

I'm pretty sure it's a translation of the same script from Research
Unix. I think BWK wrote it originally.

As a favor, please send text to the list in addition to HTML.

Thanks,

Arnold