# From: ianf@random.se (Ian Feldman, Keepers of The Setext Flame[tm])
# Newsgroups: alt.hypertext (complete, original headers at end of file)
# Date: Fri, 23 Apr 93 07:53:50 +0200
# Message-ID:
# X-URL: file://garbo.uwasa.fi/mac/tidbits/setext/setext+sgml_01.etx
# Reply-To: setext-list-request@random.se
# Organization: random design -- "Opinions, cheaply"
# Lines: 349
# Summary: setext is to plaintext as RTF is to RTFM
# Subject: Re: Looking for Electronic Publshing formats... [long]

SGML vs setext
================

by Ian Feldman_

Having fathered and mothered_ setext, the structure-enhanced text
markup method designed for use primarily by _smaller_ periodic
online publications, I feel compelled to clarify certain
misconceptions in regard to the doubts expressed in this forum as
to its usability as an electronic hypertext interchange format.

Please observe the ambiguity of the subject of this debate: the
original query_ was about "electronic formats for printed
materials" for deployment in a multi-format browser using "Amiga's
system of DataTypes to provide content-independent methods of
viewing data" (both quotes author_ verbatim). In time the
discussion has come to be centered around SGML's alleged
superiority, inevitability and, to a lesser extent, around the
setext being or not being a viable solution for online-distributed
matter.

Having read just the basic introductory document about it, meant
to provide the public at large with an easily-palatable
foundation, Eliot Kimber_ of IBM has declared_ it to be "a very
primitive, obviously easy to implement and interchange."
Admittedly, limited it may be, but 'primitive'? Anything judged
through the prism of the SGML will by definition appear primitive
(although the setext is ALSO readable) to the naked eye.
In contrast, SGML et al judged through the bias of
human-readable-text/ASCII will appear unduly complex and mostly
inaccessible to anyone having but the lowest common denominator
hardware/software at their disposal (80% of all users? 90%?).
Sure, everybody should have a Corvette... er, a SparcStation_ I
mean, but as long as not everybody does we might just as well
judge the setext on its own merits.

Eliot Kimber_ has many interesting things to say about the SGML,
data-notations_limits_ and markup methods in general, with all of
which I couldn't agree more fully. However, he also seems
oblivious to the lopsided logic present in this, his advocated
solution_ (here taken out of context but not misrepresentative of
the whole):

> simply add a layer between the source and the presentation
> system that translates the SGML source into setext dynamically:
> SGML Source --> SGML2SETEXT --> setext --> setext viewer

It strikes me as not a little ironic that in order to view
enhanced plaintext (i.e. the setext) in a basic-structured manner,
say an outline of the submitted text, one would have to first
encode it with SGML, then pipe it through a filter with a DTD
acronym thrown in for good measure. I'd have thought that, if
setext is deemed adequate for some particular job, then surely it
wouldn't have to be arrived at via the SGML-encoding route. In
fact, and if I may contribute something of a truly-heretic nature,
I'd have thought that the opposite would be an altogether
more-agreeable solution:

  plaintext --> setext --> setext2SGML --> SGML viewer

Obviously, Kimber has all the resources at his beck and call and
expects that others will have them too. We may all yearn to become
1Mbit/sec-access high-flyers of the Internet, but in the meantime
many of us have to make do with but a Have-A-Mac and never enough
funding to equip it with enough RAM to satisfy our needs.

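To make the reverse route above a little more concrete, here is a
rough sketch of what a setext2SGML step might do. Python is used
purely for illustration. The function names and the element names
(title, subhead, p) are made up for this example and belong to no
real DTD or existing tool. Only title and subhead underlines are
recognized; everything else is lumped into paragraphs.

```python
from xml.sax.saxutils import escape


def underline_kind(line):
    """Return 'title' for a row of '=', 'subhead' for a row of '-', else None."""
    mark = line.strip()
    if len(mark) >= 3 and set(mark) == {"="}:
        return "title"
    if len(mark) >= 3 and set(mark) == {"-"}:
        return "subhead"
    return None


def setext_to_tagged(text):
    """Turn setext heads and paragraphs into a flat list of tagged elements.

    The element names are hypothetical; a real converter would target
    whatever DTD the receiving SGML system expects.
    """
    lines = text.splitlines()
    out, para = [], []

    def flush():
        # Emit the paragraph collected so far, if any.
        if para:
            out.append("<p>" + escape(" ".join(para)) + "</p>")
            para.clear()

    i = 0
    while i < len(lines):
        line = lines[i].strip()
        kind = underline_kind(lines[i + 1]) if i + 1 < len(lines) else None
        # A head is a non-blank text line followed by an underline row.
        if line and kind and underline_kind(line) is None:
            flush()
            out.append(f"<{kind}>{escape(line)}</{kind}>")
            i += 2
            continue
        if line:
            para.append(line)
        else:
            flush()
        i += 1
    flush()
    return out
```

Fed the opening of this posting, such a filter would emit one
title element, then paragraphs, then one subhead element per
underlined subhead, which is about as much structure as an outline
view needs.
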
setext in multimedia
----------------------

The originator of this debate, Greg R Block_ further had this_ to
say:

> : For the moment, setext appears to me to be the most practical
> : (universal, useable, general consumption) standard around for
> : textual documents.
> But ONLY for textual documents, and that is where part of the
> problem lies. SGML's advantage is that it can structure things in
> definite ways, and embed things that are not necessarily text.

Let me respectfully suggest that anyone claiming that setext's use
at best starts and ends with ASCII text has obviously not done
their homework. Those of you familiar with the NewsGrazer_
newsreader on the NeXT may recall that the data format there
employed is that of a uuencoded richtext article _prepended_ by an
ASCII version of the text of same. This enables it to propagate
normally along the net, display as richtext on other NeXTs and the
relevant, _top_portion_of_it_, in plain elsewhere. Had the ASCII
portions of it been setextized, it'd have provided an additional,
more universally parseable, dimension of structure.

So _potentially_ setext is as valid an encapsulation method for
distributed-multimedial use as may be the RTF, SGML and the
others. But unlike the others the text content of it will ALWAYS
remain readable to the unaided eye while still offering limited
--but hardly "small"-- amounts of extractable structure.

Nor has the potential for use of setext in hypertext been
overlooked. While arguably providing only one dedicated tag for
linking of (text) elements_, the concept that it follows resembles
closely the format employed by WorldWideWeb's email_server_
(unbeknownst to one another, the WWW team and I have arrived at
similar solutions of verbose anchors in text referenced by
expanded URLs_ or comments at _end_ of documents). In this fashion
even when viewed in unenhanced state the "administrivial" linkage
data need not encroach upon the content of the document itself.

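As an illustration of that linkage scheme, the sketch below pairs
hot words in running text (a word ending in a single underscore,
such as Kimber_) with the targets listed on ".. _name" lines at
the end of a document. Python is used only for illustration. The
two regular expressions and the name resolve_hot_links are
assumptions made for this example, not part of any setext
definition, and a real browser would have to cope with more cases.

```python
import re

# Assumed shapes, modelled on this posting's own usage:
#   hot word in the text:        Kimber_
#   its target, listed last:     .. _Kimber (Eliot Kimber)
LINK_DEF = re.compile(r"^\.\. _(\S+)\s+(.*)$")

# A word that ends in '_' but does not start with one, so that
# underlined emphasis such as _smaller_ is left alone.
HOT_WORD = re.compile(r"(?<!\w)([A-Za-z][\w.-]*?)_(?!\w)")


def resolve_hot_links(text):
    """Return (hot word, target) pairs in reading order.

    The target is None when no matching '.. _name' line exists.
    """
    targets = {}
    body = []
    for line in text.splitlines():
        match = LINK_DEF.match(line)
        if match:
            targets[match.group(1)] = match.group(2).strip()
        elif not line.startswith(".. "):
            # Other two-dot lines are suppressed matter, not body text.
            body.append(line)

    pairs = []
    for line in body:
        for name in HOT_WORD.findall(line):
            pairs.append((name, targets.get(name)))
    return pairs
```

Because the targets sit on lines of their own after the text, the
body stays readable as it is, and a reader without any such tool
loses nothing but the ability to follow the links.
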
Philippe-Andre Prindeville_ adds_ this:

> SGML allows one to put wrappers on data-types that SGML itself
> isn't capable of parsing. This shows a reasonable amount of
> forethought (wish certain "commercial" standards had half a mind
> to do so). We obviously can't foresee all possible media types.
> But we can plan for their advent.

Ditto for the setext... no limits on encapsulated data types.
Anything that can be encoded in a transportable manner may be
appended last after the human-readable portion of a document and,
optionally, made into matter that setext viewers suppress by
default (in three different ways). Yet, although a dedicated
browser is always a preferable solution, setexts do not
automatically _require_ one in order to be viewable. This is in
marked [sic!] contrast to the statement expressed in the
SGML_FAQ_0.0_:

  # 99% of the fun with SGML can be had only with a parser,
  # so you do need one.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A thought or two
------------------

If past experiences are anything to go by, the biggest obstacle to
wider acceptance of the setext seems to be a common inability to
think in terms of other document models than those intended for
paper printout. Surprisingly many people, even among the
_hypertext.rules_ community, seem unaware that they **really** are
subconsciously thinking of text ending up on paper, rather than
(and despite any usual claims to the contrary) of all-electronic
delivery and "consumption." That and a second, equally-common
misconception, that anything that's understandable also must be
primitive, thus de facto unusable for the higher task at hand,
whatever the latter may be.

As an extra service for the diagram-und-table individuals among
yourselves here is an off-the-cuff attempt to summarize some of
setext's attributes in relation to those of SGML and RTF. Not
being an expert on either one of these, I ask for your forgiveness
should I happen to have misrepresented something.

Easy-O-Meter[tm]
------------------

______________ ___________ RTF ___________ SGML __________ setext
basic document flat file with  an entity made   any text file
model          embedded        up of definable  interspaced by
               typographic     logical elements subheads (also
               /tags and no    denoted by rigid other unobtrusive
               sense of syntax syntax and       optional elements
               (format         unambiguous      may be employed)
               proprietary)
-------------- --------------- ---------------- -----------------
generalized    no              YES              yes
markup?
-------------- --------------- ---------------- -----------------
primary        richtext        machine-assisted bringing order to
objective      interchange     large-scale      amorphous online-
               format          text processing  distributed data
-------------- --------------- ---------------- -----------------
papercopy-as   YES             yes              NO
ultimate-                                       and
-objective                                      noo
document model
-------------- --------------- ---------------- -----------------
smallest       a character     a character      a word
emphasis       (multistyled)   (multistyled)    (single style)
granularity
-------------- --------------- ---------------- -----------------
type of tags   /descriptor     <\end>           _this_ and ~that~
# employed     ?               unlimited        # 2 + 11 optional
-------------- --------------- ---------------- -----------------
#typographical a finite set    unlimited set    3 typographical
tags employed?                                  1 hypertextual
-------------- --------------- ---------------- -----------------
tag overhead   +25%?           +30%?            +9% (verified)
-------------- --------------- ---------------- -----------------
parser/browser yes             YES              no
required
-------------- --------------- ---------------- -----------------
encoder        no              YES              no
required       but recommended                  but would be nice
-------------- --------------- ---------------- -----------------
availability   many commercial a few commercial a free browser
of tools       readers         full-scale       for the Macintosh
               a few authoring implementations  several end-user
               implementations 1 known Windows  implementations
               a few freeware  free browser +   1 PC/unix parser
               resources       free source code engine undergoing
                               parser/browser   tests
-------------- --------------- ---------------- -----------------
installed      predominantly   professional/    50,000-100,000
base           word processors large, always    weekly readers
               (under Windows) requiring        predominantly Mac
93-04-23       MS Word native  dedicated tools  growing fast
============== =============== ================ =================

Wrapping it up in more ways than one
--------------------------------------

As an afterthought: it may come as a surprise to everyone that the
SGML_FAQ_, penned by Erik Naggum_, comes up in the Easy View
browser for the Mac with certain of its elements emphasized as
underlined richtext (version 2.3.1 of the EV, as yet being
debugged, do not ask for a copy, please). Why is it so, you may
wonder; has Erik been forced to employ some ``bastard'' format
because SGML wouldn't do? No, of course not. Erik, at the time of
writing it definitely oblivious of the very existence of setext,
has simply seen the need to add _visible_ emphasis to a FAQ
intended for wide distribution, in a fashion that's commonly used
on the net. The setext neither has the ambition nor makes any
claims to be a "revolutionary" markup method -- whenever it was
possible I have formalized the best of the current online usage
and called it setext typotags this-and-that.
Thus this SGML_FAQ_ has de facto been _enhanced_ in its plaintext
state with no extra explicit encoding overhead.

Now and then I also see on the net examples of what I'd call
spontaneous-setexts, texts subdivided with valid setext subheads
and title elements by their makers with no apparent knowledge of
it whatsoever. If neither of these provides a strong argument for
_usability_ of the method as such then I don't know what else
might do.

Yes,
------

this posting is a setext (the word stands both for the method and
a single structure-enhanced text). Had you been reading it in a
dedicated mail shell_ or newsreader_ you could have been presented
with something akin to:

    (306) "Re: Looking for Electronic Publshing formats..." (Ian Feldman, Keep...
    -----------------------------------------------------------------------------
    SGML vs setext                          <0>
    setext in multimedia                    <1>
    A thought or two                        <2>
    Easy-O-Meter[tm]                        <3>
    Wrapping it up in more ways than one    <4>
    Yes,                                    <5>

and then been able to access its parts in non-linear fashion. If
nothing else then at least the setext has a capacity to provide
unambiguous yet unobtrusive _anchors_ within texts that are
supposed to be universally accessible everywhere. WWW, WAIS_ and
Gopher people please take note.

There are other markup formats and many may well be "better" for
their respective applications but generally speaking there is no
other that can make the following claim: there is MORE to me than
meets the eye.

__Ian "Xanadude in waiting" Feldman
XU/Server[tm] not responding -- still trying

$$
.. ; The following matter may be more in the realm of wishful
.. ; thinking than is the rest of the setext, as no browser yet
.. ; exists to parse and execute here enclosed linkage information.
.. ; The principle of parseable anchors in text and expanded URLs
.. ; (Universal Resource Locators) appended last has been fully
.. ; validated by a similar construct deployed in WorldWideWeb's
.. ; email server document format however. In addition to that all
.. ; setext lines matching the reg-exp ^\.\.\s[^\.]* will by default
.. ; be suppressed from view but still recognizable to the parsing
.. ; front-end. Finally, the enclosed list of links is appended
.. ; in alphabetically-reverse order, to provide any browser with
.. ; a minimal check that the list has, indeed, been generated by
.. ; mechanical means, therefore may easily be trusted when decoding
.. ; and executing the linkage data. Lines that contain no links but
.. ; comments like this one have furthermore been made distinguishable
.. ; by mechanical means from other suppressed matter --so that a
.. ; browser may filter them out easily prior to verification of the
.. ; trustworthiness of the links. Primitive, eh?
.. ;
.. _this news:1qn588INN27o@uwm.edu
.. _solution news:19930420.063124.67@almaden.ibm.com
.. _shell (in the domain of wishful thinking)
.. _query (by Gregory R Block )
.. _publishable (or "Publshable" since nobody following it up in the beginning has corrected the spelling and now it is too late for machine-readable-reference reasons)
.. _newsreader (in the realm of wishful thinking)
.. _mothered (with Adam C Engst of TidBITS acting as a remote midwife)
.. _email_server | mail listserver@info.cern.ch \n\nsend http://info.cern.ch/hypertext/WWW/TheProject.html \nstop\n^D
.. _elements (smallest element being a word)
.. _declared news:19930420.063124.67@almaden.ibm.com
.. _data-notations_limits news:19930416.063132.922@almaden.ibm.com
.. _author (Gregory R Block )
.. _adds news:4942@ulysse.enst.fr
.. _WAIS (a routine to automatically "explode" setext matter into individual topic files for WAIS-server use already exists)
.. _URLs (Universal Resource Locators, an Internet draft standard to specify path to accessible resources; see further WorldWideWeb FAQ v0.1, file://rtfm.mit.edu/pub/usenet/news.answers/www-faq)
.. _SparcStation (substitute favorite here <\WorkStationName>)
.. _SGML_FAQ_0.0 (file://ftp.ifi.uio.no/SGML/FAQ.0.0, now expired, new draft promised)
.. _SGML_FAQ | mail -s "please send current FAQ" \n^D
.. _Prindeville (Philippe-Andre Prindeville )
.. _NewsGrazer (richtext-news front-end for the NeXT, written by Jayson Adams of NeXT, Inc.)
.. _Naggum (Erik Naggum )
.. _Kimber (Eliot Kimber )
.. _Feldman (Ian Feldman, The Current Setext Oracle )
.. _Block (Gregory R Block )
..
# original headers, suppressed on account of appearing AFTER a twodot-tt
# Path: random.se!ianf
# From: ianf@random.se (Ian Feldman, Keepers of The Setext Flame[tm])
# Newsgroups: alt.hypertext,comp.multimedia,alt.news-media,comp.text,comp.text.sgml,comp.sys.amiga.multimedia
# Date: Fri, 23 Apr 93 07:53:50 +0200
# Message-ID:
# References: <1qn588INN27o@uwm.edu> <19930416.063132.922@almaden.ibm.com> <19930420.063124.67@almaden.ibm.com> <4942@ulysse.enst.fr>
# X-References: <1qff1hINNf5u@uwm.edu> <1993Apr16.011307.20939@gallant.apple.com>
# <19930416.063132.922@almaden.ibm.com> <1993Apr16.175131.28736@gallant.apple.com>
# <1qn506INN27o@uwm.edu> <1qn588INN27o@uwm.edu>
# X-More-References: <4939@ulysse.enst.fr> <4942@ulysse.enst.fr> <19930419.113449.182@almaden.ibm.com>
# <1993Apr19.203208.2751@ornl.gov> <1993Apr20.004712.4298@gallant.apple.com>
# <1993Apr20.005046.4406@gallant.apple.com>
# X-Even-More-References: <19930420.063124.67@almaden.ibm.com>
# <2AWMs*7c1@dynam.adsp.sub.org> <1r2n6p$m1n@nigel.msen.com>
# Followup-To: alt.hypertext,comp.multimedia,comp.text,comp.text.sgml
# X-Note: ---------------------------------------------------------------
# X-Also: First Mac browser for setext, the structure-enhanced ASCII text
# X-This: format in sumex-aim.stanford.edu:/info-mac/app/easy-view-22.hqx
# X-Note: ---------------------------------------------------------------
# X-URL: file://garbo.uwasa.fi/mac/tidbits/setext/setext+sgml.etx
# Reply-To: setext-list-request@random.se
# Content-Type: setext/plain; charset=ascii_827
# Organization: random design -- "Opinions, cheaply"
# Lines: 306
# Summary: setext is to plaintext as RTF is to RTFM
# Subject: Re: Looking for Electronic Publshing formats... [long]