Talk:Character (computing)
From Wikipedia, the free encyclopedia
Disambig?
Note partial overlap with brief description at Character, and numerous links that point there that perhaps should point here. --Brion 21:41 Aug 19, 2002 (PDT)
- I agree. The most prominent link is Character (word) and it should be mentioned in the article. Character (computing) is just an abstraction of it. Mordomo (talk) 13:03, 23 December 2007 (UTC)
- Hold on there. You're replying to a 5+ year old comment, when the character disambig page looked like this and the character (computing) article looked like this. Regardless, you seem to be misunderstanding what Brion was saying. He was saying that in various other articles, there are references to computing-type "characters" that are linking the word character to the general character disambig page, whereas the links really should point to the character (computing) article instead. That may still be true; I haven't checked. Brion said there was "overlap" between the two, but I think he was just not realizing (since it wasn't tagged) that character was a disambig page.
- See, character is a disambiguation page, meant only to direct readers to the various "character"-related articles on Wikipedia. Character (computing) is one such article; it describes the kind(s) of characters used in computing and other telecommunication technology. You might think of the disambig page as covering character in the general sense, and character (computing) being a specialization, not abstraction, of it.
- The character (word) article appears to actually be a Wiktionary entry masquerading as a Wikipedia article / disambiguation page, itself. Its content is just a summary of several different definitions of the word character, including a character in literature and applications of the word character to what'd be more accurately described with more specific terms (e.g., grapheme). Up until now (I just changed it), it was also somewhat mischaracterized on the current version of the character disambig page. I'm going to nominate it for deletion or merging since it's serving the same purpose as character and the Wiktionary entry.
- In any case, there's generally no need to link back to a disambig page from one of the disambiguated articles. So there does not need to be a mention of either character or character (word) in this article. —mjb (talk) 02:40, 24 December 2007 (UTC)
Text as a Medium
The following was added to the article by an anonymous contributor. It was removed because it strays from the main topic and contains generalizations and inaccuracies in contradiction with the rest of the article. It also contains some info that may be useful to incorporate into the article; the relationship between abstract characters, bytes, and different aspects of glyph renderings might be worthy of inclusion, to the extent that it helps the reader understand the nature of a "character" in computing and machine-based telecom. However, I feel strongly that this article should not be a full tutorial on all things "text" in computers; there is precedent for creating separate articles for these broader topics. — mjb 07:54, 26 September 2005 (UTC)
- Text, as it is represented on a computer, is a linear sequence of bytes that are mapped from bytes (or collections of bytes) onto glyphs.
- Sometimes these glyphs, or their numeric byte forms, represent special relationships between the glyphs, such as those used in markup languages. These relationships may be semantic or stylistic.
- The linear sequence of bytes is transformed into a tree structure for the purposes of determining both forms of relationships. The linear sequence is broken up into pieces and then annotated with further, more specialized and detailed linear sequences of bytes within this tree that represent the location and style in which each glyph will be presented in its visual format.
- After the positioning and style have been determined, the linear sequences of bytes representing each text fragment are transformed into either vector or raster representations of each character. Sometimes, the positioning, layout and form of the glyphs depend on cursory information about the extent and classification of each glyph. Kerning is one process by which glyphs are combined based on rules regarding each character in relation to other characters to achieve an aesthetically pleasing layout of those characters.
- Once the layout and visualization are determined, the glyphs are transformed from a tree populated with annotations and glyphs into a two-dimensional raster image that may be broken up across pages based on higher-level rules regarding the placement of larger combined units of text (such as paragraphs and their visual relationships to each other).
- The positioning of glyphs in relation to each other is also defined using an order acceptable in the language that the glyphs are part of. For example, some texts are ordered glyph:left to right, line:top to bottom, page:front to back whereas others may be ordered glyph:top to bottom, line:right to left, page:back to front. Other orderings exist.
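- The relationship between abstract characters and bytes mentioned above can be illustrated with a minimal sketch. It assumes Python 3 and its built-in codecs; the helper name `encoded_forms` and the choice of three encodings are illustrative only and do not come from the article or the removed text. It shows that a single abstract character has no fixed byte form until an encoding is chosen.

```python
# One abstract character, identified by its Unicode code point.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

def encoded_forms(character):
    """Return the byte sequence for `character` under a few encodings.

    The helper name and the set of encodings are illustrative assumptions.
    """
    return {name: character.encode(name)
            for name in ("latin-1", "utf-8", "utf-16-be")}

forms = encoded_forms(ch)
for name, raw in forms.items():
    # Print each encoding name with the bytes in hexadecimal.
    print(name, raw.hex())
```

- Glyph selection, kerning, and layout happen in a separate rendering step and are not modelled by this sketch.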
Pronunciation of "char"
How do you read "char" in the context of a programming language (such as C/C++, Java or Pascal)?
Does it sound like the verb "char", like "car", or like "care"?
aditsu 08:48, 12 April 2006 (UTC)
- The Jargon File gives all three pronunciations. Most American programmers I've spoken with use the "char" pronunciation, since that's how it's written. "Car" seems to be the rarest pronunciation -- could be because it collides with the name (taken from Lisp) of a primitive operation on linked lists. --FOo 09:09, 12 April 2006 (UTC)
- I always pronounced it like "care", maybe because I'm from the West Coast. 75.15.236.62
Char vs. Bit/byte
What is the difference between a char (character) and a byte/bit? I do some programming and can't figure it out. Are a byte and a character the same length, i.e., does 255 bytes/bits equal 255 characters? 75.15.236.62
- Only if your characters are formed from single octets. So this bytes=characters stuff was widely true in the 70s and 80s when ASCII and EBCDIC reigned supreme, but now, as multi-byte character sets such as Unicode are becoming common, you can no longer assume that one character equals one byte.
- Atlant 12:53, 27 April 2007 (UTC)
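- As a rough illustration of the point above, the following sketch compares character and byte counts for the same text. It assumes Python 3 and UTF-8 as the encoding; the function name `char_and_byte_counts` and the sample strings are made up for this example. Note that `len()` on a Python string counts Unicode code points, which is only one possible definition of "character".

```python
def char_and_byte_counts(text, encoding="utf-8"):
    """Return (number of code points, number of encoded bytes) for `text`."""
    return len(text), len(text.encode(encoding))

# Pure ASCII text: one byte per character in UTF-8.
ascii_counts = char_and_byte_counts("char")

# Text with non-ASCII characters: U+00EF takes 2 bytes in UTF-8
# and U+20AC (the euro sign) takes 3.
mixed_counts = char_and_byte_counts("na\u00efve \u20ac5")

print(ascii_counts)  # (4, 4)
print(mixed_counts)  # (8, 11)
```

- So a 255-byte buffer holds 255 characters only when every character happens to encode to a single byte.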
What's this called?
I want to know what to call those symbols that a computer shows when it can't understand a character. An example of this is: ɪ (If you have a more advanced computer, you may see it as an actual character). C Teng (talk) 00:02, 11 March 2008 (UTC)
- Good question. They don't have a name that I'm aware of - they're just placeholders. Dcoetzee 00:09, 11 March 2008 (UTC)
- That particular character, ɪ, which looks like a miniature capital "I", is U+026A: "lax close front unrounded vowel" ... it is a phonetic symbol and refers to what we call the short "i" sound in English. See International Phonetic Alphabet, IPA chart for English, and Help:IPA for more info. IPA characters are used in pronunciation guides in Wikipedia articles, usually via one of the IPA templates.
- If you see it as a box or a diamond with a question mark in it, then it is indeed just a placeholder; it means that the font your browser tried to use to show it to you does not contain a glyph for that character, and the browser wasn't able to find a suitable glyph in another font. Some browsers are better at finding missing glyphs than others. —mjb (talk) 01:02, 11 March 2008 (UTC)
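- To check which character a string actually contains, regardless of whether a local font can draw it, one option is to look it up in the Unicode character database. The sketch below assumes Python 3 and its standard `unicodedata` module; the helper name `describe` is made up for this example. It cannot tell whether a browser or font has a glyph for the character, since that is decided at rendering time.

```python
import unicodedata

def describe(character):
    """Return (code point label, official Unicode name, UTF-8 bytes)."""
    code_point = "U+{:04X}".format(ord(character))
    name = unicodedata.name(character)
    return code_point, name, character.encode("utf-8")

# The character from the question, written as an escape so that the
# source code does not depend on a font that can display it.
info = describe("\u026a")
print(info)
```

- For this character the lookup gives the code point U+026A and the formal name LATIN LETTER SMALL CAPITAL I, even on a system that displays only a placeholder box for it.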