Wiktionary talk:Criteria for inclusion

From Wiktionary, the free dictionary

Archives


Old threads have been archived to:

About


Proposed changes to the CFI can be made and discussed at Wiktionary:Editable CFI.

Subpages of Wiktionary:Criteria for inclusion and its talk page:


Generic use


The attributive rule got voted out, but I didn't object to its idea, just its wording. It was as badly worded as can be imagined. It would be nice to add it back in a new form that fully explains what it means. A few points:

  1. Most important IMO and most potentially controversial, specific entities should not 'require' generic use, but generic use should be one way for an entry to pass. Therefore if Late Latin isn't used generically, it won't be deleted.
  2. Uncontroversially, the wording should be precise and leave as little room for doubt as possible. For example, attributive use could mean grammatical attributive use. So David Beckham haircut would be attributive use of David Beckham to modify haircut. Generic use, IMO, should be a meaning other than the primary one. So Billy Elliot would pass because of three citations of 'a Billy Elliot' referring to a young male dancer. All three citations would have to back up the same meaning, not just any meaning. Mglovesfun (talk) 08:38, 4 August 2010 (UTC)Reply

Can you give some context please. What part of the CFI are you proposing to modify? What do you want to see included/excluded that isn't currently? Why? You appear to be talking about both generic use and attributive use, (although it's not clear what the uses are of) yet the section title is just generic use? Can you link to the vote in question so we can see what it was about and what the wording was? Have you got any specific wording in mind or is this just a statement of desire for someone to do something about something? Thryduulf (talk) 09:27, 4 August 2010 (UTC)Reply

Wiktionary:Votes/pl-2010-05/Names of specific entities. Now that we don't have an attributive use rule, I'd like a generic use rule. I'll try and work out some wording when I have time. Mglovesfun (talk) 10:22, 5 August 2010 (UTC)Reply
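The requirement in point 2 above, that all three citations back up the same meaning rather than just any meaning, can be sketched as a small check. This is only an illustration of the proposal as worded in this thread, not existing policy or tooling; the `Citation` type, its fields, and the sense labels are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

REQUIRED_CITATIONS = 3  # the "three citations" mentioned in the proposal


@dataclass(frozen=True)
class Citation:
    source: str  # where the quotation was found (hypothetical field)
    sense: str   # the meaning the quotation supports (hypothetical label)


def passes_generic_use(citations):
    """Return True if at least one single sense is backed by enough distinct sources."""
    sources_by_sense = defaultdict(set)
    for citation in citations:
        sources_by_sense[citation.sense].add(citation.source)
    return any(len(sources) >= REQUIRED_CITATIONS
               for sources in sources_by_sense.values())
```

Under this sketch, three quotations of "a Billy Elliot" that all mean a young male dancer would pass, while three quotations that each support a different meaning would not.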

{{look}}

General rule incomplete for Wikisaurus


The current general rule is "A term should be included if [and only if?] it's likely that someone would run across it and want to know what it means."

I think this rule is incomplete for Wikisaurus. For the most part, this rule addresses reading. A thesaurus primarily addresses writing. You know the definition of a word (possibly with great attention to its subtleties); you just want to find a synonym.

I would suggest that we amend the rule to read something like "A term should be included if it's likely that someone would run across it and want to know what it means or (for Wikisaurus) that someone would want to find a word or phrase with a similar meaning." Or we could deal with this in a new subsection. Or even in a brand new project page: say, CFI for Wikisaurus.

The reason for this is that the current rule has led us to delete thousands upon thousands of Wikisaurus idioms and slang phrases that do not meet the current criteria for inclusion in Wiktionary but, in my opinion, are useful in finding a synonym. A phrase may be defined by its words (and thus need no separate entry in Wiktionary) but still be useful in finding a synonym idiom with just the right connotations.

For example, the phrase "exercise my anus" was an entry in Wikisaurus's defecate in January 2009 but doesn't appear now. Similarly, "bikini stuffers" was a synonym for breasts in March 2006 but not now. If I was writing a story about a lazy summer on the beach, I might want to use "bikini stuffers" instead of "racks" or "boobs".

The Wikisaurus:Breasts entry now includes the warning "Only words that meet criteria for inclusion can be included." My point is that it's incomplete for thesauruses. --RoyGoldsmith 01:10, 29 January 2012 (UTC)Reply

I don't think it unreasonable to include things in Wikisaurus that we don't have entries for. I don't find your examples the most convincing, though.--Prosfilaes 11:16, 30 January 2012 (UTC)Reply
Why should Wikisaurus have laxer criteria for inclusion? Shouldn't a writer using this reference be confident that a suggested synonym is an attested part of the language, rather than some nonce coinage? Furthermore, should the writer find an unfamiliar word, they would also benefit from a detailed definition, notes about connotation, register, usage, etc., so they can employ the word correctly and appropriately. We have no business making up such entries out of the blue.
I don't see how we can include such terms sufficiently well, and it looks to me like this dictionary would suffer from their inclusion, while the thesaurus wouldn't really benefit. Michael Z. 2012-01-30 22:15 z
To answer my own question, perhaps currently-popular neologisms, too recent to meet our CFI could be included, but I would think they should be labelled as such. Michael Z. 2012-01-30 23:12 z
Because often the best term is SOP. Frequently, "at the seashore" is better than "littoral" and "named after a person" is better than "eponymous". But those two are rightly not eligible for entries.--Prosfilaes 00:21, 31 January 2012 (UTC)
Perhaps, but those feel more like definitions than something I would be hoping to find in a thesaurus. Michael Z. 2012-01-31 16:18 z
Can't thesaurus entries include non-dictionary-worthy items, just not making them links? Equinox 22:59, 30 January 2012 (UTC)Reply
We used to have pages like Wikisaurus:breasts/more, but they were so embarrassing the community decided to get rid of them. -- Liliana 16:24, 31 January 2012 (UTC)
OK, let's say I'm a writer, writing a story about the beach. I might want to use "bikini stuffers" (if I knew about it) as opposed to "breasts" or "boobs" or "rack". "Breasts" is too clinical, "boobs" too teenage-ish and "rack" too racy. I might want something that specifically relates to swimsuits and bikinis. Roget's splits terms by numeric category so that you can find exactly the same meaning terms (synonyms), almost-exact-meaning terms (hyponyms or hypernyms), sort of the same meaning terms, terms that are related in some vague sense to other terms and, particularly, terms that are like other terms but with a different connotation.
In another example, "spitting chips" (a current red link) does not have the exact same connotations as any other synonym for Ws:angry. Are we supposed to leave out spitting chips (or bikini stuffers) just because they aren't listed as terms in Wiktionary? (And I'm not saying that spitting chips or bikini stuffers deserve an entry in the main dictionary.)
Remember, a thesaurus is not merely a list of synonyms. And it certainly isn't a list of attested synonyms. Creative writers want "nonce coinages" and one-time-only usages. In certain cases, they may even prefer phrases that have yet to see the light of publication. If a reader wants only attested usages, they always have blue links vs. either red links or straight text. For non-published phrases, I would isolate them on /more pages. --RoyGoldsmith 05:15, 1 February 2012 (UTC)Reply
Are there professional thesauruses that make up new words, or are we about to corner the market in this category? Is there evidence of demand for this service, or is that speculation? Michael Z. 2012-02-01 17:35 z
What is a professional thesaurus? Do you mean published thesaurus? Perhaps even "published in paper-book form"? Or do you mean well-known thesauruses? Well, Roget's International Thesaurus (5th edition, published by Harper Collins in 1992) has tons of new terms. On virtually every page, in virtually every entry, you have roughly as many new reference terms (not in the thesaurus as a root) as root terms. For example, the first entry on page 1 is BIRTH. It has 28 noun terms, including "having a baby", "giving birth", "the stork", "birth throes", "blessed event" and so on. Most of these phrases are not used as root terms.
What I'm trying to say is that there should be NO connection between a dictionary (used for looking up meanings) and a thesaurus (used for finding like-meaning terms). The methodology for constructing them is totally different. For one thing, idiomaticity gets thrown out of the window. In a thesaurus, you want the root terms to be "easily derived from the meaning of [reference terms'] separate components" but a reference term might not be easily derived from the root term. For example, you can easily figure out that "bikini-stuffers" means "breasts" but, given the concept of breasts, you probably would not derive the phrase bikini-stuffers. --RoyGoldsmith 04:36, 5 February 2012 (UTC)
(Late) Correction of "the community decided": the editors decided. --Dan Polansky (talk) 09:52, 2 January 2023 (UTC)Reply

A summary?


I often thought that CFI is quite long and dense to read, as it uses a lot of technical terms. This makes it hard for newcomers to understand, which is a problem because they are the ones who need it the most! So maybe it would be a good idea to provide a short summary of the most important parts of CFI, maybe on a separate page, in simple "welcome message"-style prose? —CodeCat 19:11, 8 April 2012 (UTC)Reply

Text for COALMINE.


Currently, § "Idiomaticity" ends with this indented, italicized paragraph:

:''The vote [[Wiktionary:Votes/pl-2009-12/Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word]] adds a criterion for inclusion without specifying text to be amended in this document, so please see it for the additional criterion.''<ref>([[WT:COALMINE]]) [[Wiktionary:Votes/pl-2009-12/Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word]]</ref>

I'd like to propose that it be replaced with this unindented, unitalicized paragraph:

If a collocation is significantly more common than an included single-word spelling, then the collocation <s>is</s> <u>may be</u> included as well, even if it is unidiomatic or debatable. For example, {{term|coalmine|lang=en}} is well attested, but {{term|coal mine|lang=en}} is significantly more common, so both are included, regardless of whether {{term||coal mine|lang=en}} is otherwise idiomatic.<ref>([[WT:COALMINE]]) [[Wiktionary:Votes/pl-2009-12/Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word]]</ref>

(O.K., so that wording isn't great. But it's an improvement over what we've got now. And I'd welcome further improvements.)

RuakhTALK 21:05, 17 August 2012 (UTC)Reply

I'd support that, but this isn't the venue to propose it in. --Μετάknowledgediscuss/deeds 21:20, 17 August 2012 (UTC)Reply
Why not? —RuakhTALK 21:22, 17 August 2012 (UTC)Reply
I expect non-admins don't watch this page. --Μετάknowledgediscuss/deeds 21:26, 17 August 2012 (UTC)Reply
I think it should read "then the collocation may be included as well" rather than "then the collocation is included as well." "Should be included" would be fine too, AFAIAC. DCDuring TALK 23:10, 17 August 2012 (UTC)Reply
I see no need for such a change. However, if we're to make it, then I think the text should indicate in its discussion of the coal mine example that coal mine and coalmine are forms of the same word phrase thing, which it doesn't now.​—msh210 (talk) 05:13, 11 September 2012 (UTC)Reply
Er, also in the normative part. As currently worded, it allows the house as significantly more common than encephalon.​—msh210 (talk) 05:20, 11 September 2012 (UTC)Reply
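The objection about "the house" and "encephalon" can be made concrete with a small sketch. It assumes, purely for illustration, that two spellings count as forms of the same term when they are identical once spaces and hyphens are removed, and it uses a hypothetical `min_ratio` parameter as a stand-in for "significantly more common", which the proposed wording does not quantify. None of the names or numbers below come from the vote itself.

```python
def _squash(spelling):
    """Normalise a spelling by dropping spaces and hyphens (an assumed test of sameness)."""
    return spelling.replace(" ", "").replace("-", "").lower()


def same_term(collocation, single_word):
    """Return True if the two spellings differ only in spacing or hyphenation."""
    return _squash(collocation) == _squash(single_word)


def coalmine_allows(collocation, collocation_count,
                    single_word, single_word_count, min_ratio=2.0):
    """Sketch of the proposed rule: the spaced spelling may be included when it is
    a form of an attested single-word spelling and is markedly more common.

    min_ratio is hypothetical; the counts are whatever usage figures editors agree on.
    """
    if single_word_count <= 0:
        return False  # the single-word spelling must itself be attested
    if not same_term(collocation, single_word):
        return False  # guards against unrelated pairs such as "the house" / "encephalon"
    return collocation_count >= min_ratio * single_word_count
```

Without the `same_term` guard, the frequency comparison alone would admit any common collocation against any rare word, which is the loophole described above.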

Discussions of durability


Because it may be useful to have this index of them, here are some past discussions of durability:

- -sche (discuss) 23:43, 18 August 2012 (UTC)Reply

See also WT:SEA for a list of archives and engines that are useful when looking for durable citations. - -sche (discuss) 20:45, 3 August 2015 (UTC)Reply
Types of media which have been cited in entries:
Note that this list is not concerned with whether particular types of media were necessary to attest a term, as in liliger and ΖΟΑΠΑΝ and ᚅᚔᚑᚈᚈᚐ, or were merely supplemental, as in deutsch. - -sche (discuss) 20:45, 3 August 2015 (UTC)Reply

Formatting of misspellings.


I assume it's uncontroversial to change this:

Once it is decided that a misspelling is of sufficient importance to merit its own page, the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by this simple entry:
# {{misspelling of|[[...]]}}
An additional section explaining why the term is a misspelling should be considered optional.

to this:

Once it is decided that a misspelling is of sufficient importance to merit its own page, the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by this simple definition:
# {{misspelling of|...}}
An additional section explaining why the term is a misspelling should be considered optional.

?

(I.e., changing "entry" to "definition", and removing the [[ and ]] from inside {{misspelling of}}?)

RuakhTALK 15:22, 10 September 2012 (UTC)Reply

Never assume.  :-)  But I, for one, support such an edit without a vote.​—msh210 (talk) 16:29, 10 September 2012 (UTC)Reply
I would support something like "... followed by a simple definition using the following format:". --BB12 (talk) 18:28, 10 September 2012 (UTC)Reply
I support the spirit of the change, but it needs to mention and explain the lang= parameter too. —CodeCat 19:21, 10 September 2012 (UTC)Reply
O.K., first of all, @BenjaminBarrett12 and @CodeCat: your comments imply that you don't support the currently proposed changes unless modified as you propose. If you don't, then — why don't you? Do you not consider them to be improvements? Do you feel that they're too minor, on their own, to warrant editing WT:CFI? Something else? (I ask because part of the point of being able to make uncontested changes after mere discussion, without a full vote, is that it allows smaller changes to be made piecemeal, without much bureaucracy. If lots of people jump on and add riders, refusing to support the original change, then I think we'll end up back where we started. I hope that you two aren't holding this change "hostage" to other changes you want.)
Those questions out of the way . . . how about:
Once it is decided that a misspelling is of sufficient importance to merit its own page, the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by a simple definition using the following format:
# {{misspelling of|occurred|lang=en}}
An additional section explaining why the term is a misspelling should be considered optional.
? (This incorporates BenjaminBarrett12's change; it adds lang=en per CodeCat — though I suspect that now DCDuring will object; and it uses the "occurred" example from earlier in the "Spellings" section, rather than .... This last part is because lang=... was too vague, and I feared that of|...|lang=en could be taken to imply that only English misspellings are allowed, whereas of|occurred|lang=en seems more obviously just an example.)
RuakhTALK 19:41, 10 September 2012 (UTC)Reply
Some people here think black and white like that but I try not to. Any improvement is good, even if it's not yet the end result I would prefer. I support the change, but I'm also pointing out that it can be improved further and that I would prefer that. —CodeCat 19:51, 10 September 2012 (UTC)Reply
Basically ditto. --BB12 (talk) 21:39, 10 September 2012 (UTC)Reply

Support with all changes (Ruakh's, BB's, and CodeCat's) --Μετάknowledgediscuss/deeds 14:00, 11 September 2012 (UTC)Reply

Does that mean that you only support if all changes are made? Or do you support each change independently? —RuakhTALK 22:44, 11 September 2012 (UTC)Reply
Independently. In general, you can assume that my votes for certain changes must be enacted together iff I say "iff". --Μετάknowledgediscuss/deeds 00:27, 12 September 2012 (UTC)Reply
Support Ruakh's original change, don’t mind (would support, but would also be OK without) BB's or CC's. - -sche (discuss) 19:21, 11 September 2012 (UTC)Reply
I support Ruakh's and Ruakh+BB12's also.​—msh210 (talk) 20:15, 11 September 2012 (UTC)Reply
Does that mean that you object to CodeCat's change, or merely that you don't actively support it? —RuakhTALK 22:44, 11 September 2012 (UTC)Reply
It's a good idea in theory, but I can't think of an implementation that is not too wordy or awkward and that refers to English also (not only foreign entries). So I suppose I'm opposed to the exact wording proposed above while in favor, perhaps, of another.—msh210℠ on a public computer 03:27, 12 September 2012 (UTC)Reply
lol, this is almost as bureaucratic as a vote... - -sche (discuss) 23:14, 11 September 2012 (UTC)Reply
"Fancy thinking the Bureaucracy was something you could hunt and kill!" said the head. "You knew, didn't you? I'm part of you?" —RuakhTALK 23:29, 11 September 2012 (UTC)Reply
I agree with Sche. This really isn't that complicated, guys. Nitpicking ≠ consensus-gathering. --Μετάknowledgediscuss/deeds 00:27, 12 September 2012 (UTC)Reply
I mean to say, אוי#Yiddish. --Μετάknowledgediscuss/deeds 00:38, 12 September 2012 (UTC)Reply
I support Ruakh’s and CodeCat’s changes, and don’t mind Benjamin’s. I also assume it’s uncontroversial. — Ungoliant (Falai) 23:32, 11 September 2012 (UTC)Reply
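For anyone generating such entries with a script, the definition line discussed in this thread can be produced as below. This is a minimal sketch: the helper name is hypothetical, and the only thing taken from the discussion is the shape of the line, `# {{misspelling of|occurred|lang=en}}`, with the target unlinked and a language code supplied.

```python
def misspelling_definition(correct_spelling, lang):
    """Build the definition line for a misspelling entry (hypothetical helper)."""
    if not correct_spelling or not lang:
        raise ValueError("both the correct spelling and a language code are needed")
    if "[[" in correct_spelling or "|" in correct_spelling:
        # The thread proposes a bare target, without [[ ]] inside the template.
        raise ValueError("pass the bare spelling, without link brackets or pipes")
    return "# {{misspelling of|" + correct_spelling + "|lang=" + lang + "}}"


print(misspelling_definition("occurred", "en"))
# prints: # {{misspelling of|occurred|lang=en}}
```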

Archaic inflected forms


There are a number of verbs in Russian that have multiple choices of inflection for all forms of the same verb (not just one complementary form), and these forms are equally in use, for example: "брызгать": я брызгаю/я брызжу, "алкать": я алкаю/я алчу, "рыскать": я рыскаю/я рыщу etc. For some of the verbs there is a contemporary way of inflecting them and an old one that was used quite widely in the 19th century (by the classic Russian writers). The old inflections may even appear as the main ones in some grammar books from the beginning of the 20th century. Lexicographers, of course, give the contemporary inflection in today's dictionaries and either omit the old one or mark it as archaic. Sometimes such words are treated as w:defective verbs in the new dictionaries: the infinitive is even declared nonexistent, while the most widely used forms are preserved. For example, the infinitive "обымать" is stated to be valid only for the standard conjugation and is no longer considered an infinitive for the old one; for the old conjugation ("объемлю", 1st person present) only the present tense is considered to exist.

Since the main purpose of Wiktionary is to describe all words (and their forms), however outdated, so that any form can be looked up, my idea is to state explicitly in Wiktionary policy that all forms, even old ones (those that belong to this language, not to its Old Language counterpart), are to be included in the word articles, regardless of whether they are in use today.

As an example, I made this article with two conjugations, the second one marked as old. Another user, guided by today's dictionaries, moved the second conjugation into a defective verb article. Please arbitrate who was right and prescribe the correct way of dealing with such cases. Soshial (talk) 16:30, 12 January 2014 (UTC)

I don't know if that's really feasible. Sometimes there are many different ways that an old form was written (seien is an extreme example), and we can't fit all those forms into one table. —CodeCat 17:02, 12 January 2014 (UTC)Reply
I see, but I was talking not about spelling variants, but about forms that are equal in usage but their production belongs to different classes of conjugation. Soshial (talk) 18:14, 12 January 2014 (UTC)Reply

children's language


Do we need to tweak CFI for children's language? For example, i seem to remember from long ago that "pesk" was used at least by children as a noun in English in the USA (S/He's a real pesk.). I haven't lived in an English-speaking country for a long time, so i don't know whether it's still used. Websites and even Google Books do a bad job of recording the language of children, so i'm not surprised Google only finds very few hits for "a real pesk". --Espoo (talk) 10:45, 3 February 2014 (UTC)Reply

Slang and dialect will be hard to cite, but I don't see any reason to change our rules. On one hand, IMO the citation rules are important in keeping words that people might actually find and look up instead of pretending to cover all unrecorded slang. On the other, children's language is at the bottom a hopeless mire; every family has its own cute mispronunciations and English spellings will vary over the map.--Prosfilaes (talk) 15:00, 3 February 2014 (UTC)Reply

Widespread use

A user suggests that this Undetermined talk page be cleaned up, giving the reason: “Please define "clearly widespread use" (as in "“Attested” means verified through clearly widespread use"). Does it include "clearly widespread use" in spoken usage in a (larger) region? Or has it to be verified by a search engine (like google) through its search results (many search results = word is attested)?”.
Please see the discussion on Requests for cleanup(+) or the talk page for more information and remove this template after the problem has been dealt with.

As a practical matter, it is resolved by vote, but evidence of any kind may be appropriate. It is intended to allow for acceptance of colloquial expressions not appearing in print and expressions that are hard to otherwise cite because they are swamped in search by even more common terms (eg, abbreviations) and to reduce abusive use of {{rfv}}. It is clearly beneficial to Wiktionary not to overuse this rationale as users benefit from citations, even more than they benefit from manufactured usage examples. DCDuring TALK 16:06, 13 March 2015 (UTC)Reply


On the Attestation section I suggest changing:

"As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups, which are durably archived by Google."

...to:

"As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups, which are durably archived in a distributed fashion by Google and others. For more information on searchable external archives that are considered durable, see Wiktionary:Searchable external archives."

SageGreenRider (talk) 15:31, 14 April 2016 (UTC)Reply

Requested changes to protected page

Sentences
  • “This is meant to filter out words that may appear and see brief use, but then never be used again.”
  • “Constructed languages have not developed naturally, but are the product of conscious effort in the fulfillment of some purpose.”
  • “Terms originating in fictional universes which have three citations in separate works, but which do not have three citations which are independent of reference to that universe may be included only in appendices of words from that universe, and not in the main dictionary space.”
  • “Wiktionary classifies all as proper nouns, but applies caveats to each.”

What to change: Remove the comma before “but,” as it does not start an independent clause.


Also, change:
Generic terms are common rather than proper nouns. For example: Remington is used as a synonym for any sort of rifle, and Hoover as a synonym for any sort of vacuum cleaner. (Both are also attested family name words, and are included on that basis as well, of course.<!-- SO this could use BETTER EXAMPLES -->)
to:
Generic terms are common rather than proper nouns; e.g., Remington is used as a synonym for any sort of rifle and Hoover as a synonym for any sort of vacuum cleaner (both are also attested family name words and are included on that basis as well). <!-- This could use better examples. -->

Alternatively, "for example" can be used, instead of "e.g."


Also, change:
including the manufacturer, distributors, retailers, marketers, and advertisers, their parent companies, subsidiaries, and affiliates, at time of authorship
to:
including the manufacturer, distributors, retailers, marketers and advertisers, their parent companies, subsidiaries{{,}} and affiliates


Also, change:
[[hypocoristic|Hypocoristics]], [[diminutive]]s, and [[abbreviation]]s of names (such as [[Jock]], [[Misha]], [[Kenny]], [[Ken]], and [[Rog]]) are held to the same standards as names.
to:
[[Hypocoristic]]s, [[diminutive]]s{{,}} and [[abbreviation]]s of names (such as [[Jock]], [[Misha]], [[Kenny]], [[Ken]]{{,}} and [[Rog]]) are held to the same standards as names.


Also, change the two serial commas to {{,}} in the following sentence: '''Given names (such as [[David]], [[Roger]], and [[Peter]]) and family names (such as [[Baker]], [[Bush]], [[Rice]], [[Smith]], and [[Jones]])


Also, change the serial comma to {{,}} in the following sentences: Wiktionary has main articles giving etymologies, alternative spellings, meanings, and translations for given names
and
Examples include the Internet, the Magna Carta, the Mona Lisa, the Qur'an, the Red Cross, the Titanic, and World War II.


More sentences
  • “Terms which are seldom or never used literally are not covered by this rule, and can be included on their own merits.”
  • “are words, and subject to the same criteria for inclusion as any other words”
  • “translations for given names and family names, and has two appendices for indexing those articles:”

What to change: Remove the comma before “and,” as it does not start an independent clause.


PapíDimmi (talk | contribs) 05:58, 31 July 2016 (UTC)Reply


I don't know where you got that rule about commas and independent clauses, but it isn't a thing. They're mostly fine as they are. Equinox 06:22, 31 July 2016 (UTC)Reply

"that most languages, including English, do not have an academy to establish rules of usage"


"English and several other major languages don't have a language regulator" I suggest linking it to w:List of language regulators. Chinese, French, Spanish, Russian, Arabic (5 out of 6 UN languages) are regulated. d1g (talk) 01:29, 4 March 2017 (UTC)

The statement is true and there is no need to link to that page. —Μετάknowledgediscuss/deeds 03:31, 4 March 2017 (UTC)Reply
It is true if we count languages without respect to active speakers.
The link above shows that major languages (with many speakers) are regulated, with the notable exception of the English family.
Personally, I would count speakers of the language. d1g (talk) 09:22, 8 March 2017 (UTC)

Inflections


Are inflections of misspellings supposed to be included, and if so, should they also have the misspelling tag or just the inflection template? Jonteemil (talk) 20:22, 12 August 2017 (UTC)

I'd only include inflections of misspellings if the misspelled inflections are themselves commonly encountered, and I would only label them {{misspelling of}} the correctly spelled inflection, I wouldn't bother with {{inflection of}} the basic misspelling. But that's just my opinion. —Aɴɢʀ (talk) 07:11, 13 August 2017 (UTC)Reply
The fact that we have head|en|misspelling might suggest not. However, IIRC CodeCat objected when I started removing some of these. Equinox 14:32, 13 August 2017 (UTC)Reply
Do I understand, then, that every misspelling should be treated individually on a per-word basis rather than a per-lemma basis? This would mean that we could have a misspelling of a plural form without having a misspelling of the corresponding singular form.
Personally, it makes more sense to me if we treat misspellings the same as any other spelling variant. —CodeCat 15:31, 13 August 2017 (UTC)Reply
If the plural form is frequently misspelled, but the singular form isn't, then why not? Although, I'd like to see a concrete example of a word whose plural (or other inflected form) is often misspelled but whose singular (or other lemma form) isn't, not counting times where it's the addition of the inflectional morpheme that causes the misspelling. (Obviously in cases like abhored it's only the inflected form that is a misspelling; I don't mean things like that.) If all of our inflected misspellings are things like abhored (where it's the addition of the inflection that causes the misspelling) or like accomodations (where the lemma and inflected form are both common misspellings), then this issue is purely academic. —Aɴɢʀ (talk) 21:39, 13 August 2017 (UTC)
I think we should at least note on the inflection page in some way that it is a misspelling. Otherwise someone can look up, say, pharoahs (trying to determine whether it is spelled correctly), see that we have an entry for it, and assume it is spelled correctly. But where exactly this is noted on the inflection page doesn't matter to me. Germyb (talk) 21:53, 13 August 2017 (UTC)Reply
I agree, and I've changed pharoahs accordingly. —Aɴɢʀ (talk) 09:18, 14 August 2017 (UTC)Reply

Requesting a simple edit here because page is locked. At WT:ATTEST, please provide a link to attestation at Wiktionary, which in turn also provides the link to attested language at Wikipedia. This link is educational for lay users who otherwise (as the page is now) may not know where the word "attested" came from (e.g., "Is that a Wiktionary thing?" —No, it is a standard thing in all dictionaries). Quercus solaris (talk) 23:31, 9 March 2018 (UTC)Reply

It seemed to me better to link “Attested” (which already has a definition "(linguistics) Of words or languages, proven to have existed by records.") and add the 'pedia link to that page too, so have done that. (It is usually preferable to avoid wikifying section titles, since the link becomes unduly prominent.) --Enginear 03:37, 10 March 2018 (UTC)Reply
Perfection. Thank you! Quercus solaris (talk) 16:43, 10 March 2018 (UTC)Reply

Edit request: Pronouns in idioms


The kind folks at Wiktionary:Information_desk/2018/March#Pronouns_in_idioms have opined that the current practice is to use "one" or "one's" with reflexive verbs (the action is being done to the subject) and "someone" or "someone's" with transitive verbs (the action is being done to somebody other than the subject). I think the text at Wiktionary:Criteria_for_inclusion#Pronouns should be updated to explain that; currently it only mentions "one" and "one's". Thanks! -- Beland (talk) 03:52, 26 March 2018 (UTC)Reply

Done. - -sche (discuss) 16:38, 26 March 2018 (UTC)

Edit request: Shortcut box for WT:THUB


The shortcut WT:THUB has been pointed to the new section about translation hubs, it might be a good idea to add a box for it like for the other shortcuts. --Ørjan (talk) 00:19, 28 May 2018 (UTC)Reply

Done. Granger (talk · contribs) 00:38, 28 May 2018 (UTC)

Mirrored websites


Sorry if this is an obvious question, I'm not very techy. Does a mirrored website count as durably archived? By that I mean (as far as I understand it) a website that keeps auto-updated copies of itself hosted by various third parties? GaylordFancypants (talk) 19:50, 10 August 2018 (UTC)Reply

No. Stick to physically published things, Google Books, and Usenet, and you'll be good. —Μετάknowledgediscuss/deeds 22:49, 10 August 2018 (UTC)Reply

THUB


What is a correct THUB? I fundamentally don't understand how someone could think that a word in one language could give inherent notability to a word in another language. — Mr. Guye (talk) (contribs) 

This isn't Wikipedia; we don't use notability. There are no strict rules, and the guidelines are explained in the link which you yourself have already provided. —Μετάknowledgediscuss/deeds 23:59, 16 August 2018 (UTC)Reply
(edit conflict) It's a new addition to CFI, but one example is day after tomorrow. The point of a translation hub, as I understand it, is to allow us to help readers in finding translations for phrases that are sum-of-parts in English but often idiomatic or otherwise hard to predict in other languages. —Granger (talk · contribs) 00:00, 17 August 2018 (UTC)Reply
@Mx. Granger, Metaknowledge: Could one of you add a footnote with a link to Wiktionary:Votes/pl-2018-03/Including translation hubs, please? Per utramque cavernam 23:04, 2 January 2019 (UTC)Reply
Done. Granger (talk · contribs) 09:51, 3 January 2019 (UTC)

Requested edit


Sorry don't know the Wiktionary templates for requesting edits, but I noticed an error:

In the Idiomaticity section, the fourth and fifth sentences read:

  • "This criterion is sometimes referred to as the fried egg test, as a fried egg generally means an egg (and generally a chicken egg or similar) fried in a particular way. It generally doesn't denote a scrambled egg, which is nonetheless cooked by frying."

But... not always, I cook them in the microwave sometimes (You can do that with scrambled eggs; not with the sunny-side-up type, I don't think). So I'd recommend deleting a comma and a word and changing a word:

  • "This criterion is sometimes referred to as the fried egg test, as a fried egg generally means an egg (and generally a chicken egg or similar) fried in a particular way. It generally doesn't denote a scrambled egg that is nonetheless cooked by frying."

Changed "which" to "that" because I think that "that" indicates some subset of scrambled eggs while "which" could be taken to imply that all scrambled eggs are fried. Not 100% sure about that.

You could also change the wording in other ways if you prefer: "This criterion is sometimes referred to as the fried egg test, as a fried egg generally means an egg (and generally a chicken egg or similar) fried in a particular way. It generally doesn't denote a scrambled egg, even if cooked by frying." That's fine too. Or whatever. Herostratus (talk) 06:00, 2 October 2018 (UTC)Reply

Better? Equinox 12:42, 2 October 2018 (UTC)Reply
Now it reads like this:
"This criterion is sometimes referred to as the fried egg test, as a fried egg generally means an egg (and generally a chicken egg or similar) fried in a particular way. It generally doesn't denote a scrambled egg, which may nonetheless be cooked by frying."
Personally, I have never heard of a scrambled egg being cooked by frying, so this reads strangely to me. Mihia (talk) 23:22, 6 June 2019 (UTC)Reply
In my experience, you cook a scrambled egg the same way you do a fried egg, but with a little more scrambling, and a Google search for scrambled egg confirms this. I'm not sure whether you're disagreeing with the definition of frying, but the Internet agrees you generally put oil or butter in a pan and fry some scrambled eggs over a little heat.--Prosfilaes (talk) 05:02, 7 June 2019 (UTC)Reply
Well, it seems that there are differences here, perhaps regional differences, about what "scrambled eggs" are and/or what "frying" entails. Where I come from (southern England), no one would ever, as far as I am aware or have ever experienced, use the term "frying" to refer to the standard method of cooking scrambled eggs. Mihia (talk) 22:16, 7 June 2019 (UTC)Reply
@Mihia: So how do you refer to cooking food in a frying pan? Sautéing? —Mahāgaja · talk 10:19, 8 June 2019 (UTC)Reply
I call it frying (normally). However, I would not normally make scrambled eggs in a frying pan -- unless there was nothing else to hand. Normally I would make them in a small saucepan. If I did have to make them in a frying pan then I would not "fry" them (would not get the pan hot enough to "fry"). Per the scrambled eggs recipe at [1], "With moderate to high heat, your eggs would start sizzling and frying in the butter upon contact, not collaborating with it. Frizzled and frazzled is fine for fried eggs, but for scrambled, we’re looking for a synergistically buttery eggsperience." Mihia (talk) 11:08, 8 June 2019 (UTC)Reply
I've just created an entry for eggsperience, so if nothing else comes of this discussion, at least there's that. For what it's worth, I rarely make scrambled eggs, but when I do I use a frying pan and think the method could plausibly be described as stir-frying. —Granger (talk · contribs) 12:28, 8 June 2019 (UTC)Reply
[edit]

Shouldn't it be changed to Wiktionary:English entry guidelines? Or which page is meant there? Adam78 (talk) 17:21, 1 January 2019 (UTC)Reply

@Adam78 You're right; fixed. — Mnemosientje (t · c) 12:55, 14 March 2019 (UTC)Reply

Treatment of SOP hyphenated compounds


During the course of the vote here, a question has arisen as to whether or not the CFI are currently (pre-vote) intended to exclude hyphenated compounds that are non-idiomatic sum-of-parts (i.e. hyphenated compounds that fundamentally mean the same as the unhyphenated words, which are themselves sum-of-parts). This question applies not only to hyphenated compound modifiers, such as those that are the subject of that vote, but also to ordinary (non-attributive) hyphenated compound nouns. The interpretation of the CFI seems to depend (perhaps amongst other things) on whether "single word" in the first paragraph of "General rule" and "expression" in the first paragraph of "Idiomaticity" are intended to include or exclude hyphenated compounds. In terms of policy, is this point settled or open? (If it is settled then I think the CFI text should be made more explicit; if it is not settled then I suppose in an ideal world it should be.) Mihia (talk) 19:20, 6 June 2019 (UTC)Reply

I don't know that it is settled, but I don't hold that hyphenated words are de facto single word terms. - TheDaveRoss 01:22, 7 June 2019 (UTC)Reply

Requested edit 2020-Jan-26


Delete the interwiki links d:Q4657574. Taylor 49 (talk) 01:47, 26 January 2020 (UTC)Reply

That's a Wikidata problem; take it up with them. (And I think that Wikidata item is poorly conceived and inherently useless, but again, that's not our problem.) —Μετάknowledgediscuss/deeds 02:11, 26 January 2020 (UTC)Reply
I meant it vice versa. Delete the links from the locked page Wiktionary:Criteria_for_inclusion in favor of the wikidata item. Taylor 49 (talk) 03:17, 26 January 2020 (UTC)Reply

Edit request: Formatting, phrasing, and minor changes in multiple sections


I noticed while reading the current revision of the criteria for inclusion that much of the phrasing in it could be improved by putting related sentences that were placed on different lines onto the same one. I further noticed that there were tone, formatting, and various other details that needed copy editing, among other things. Overall, I believe that the changes I am proposing do not change the substance of the criteria for inclusion; instead they simply make it mean what it says and say what it means, and therefore probably don't require any sort of voting. The changes I am proposing are wide ranging, so I have not posted them here. Instead, I have put my proposal in my sandbox and listed a brief list of some of the changes I am putting forward as well as the rationale for the changes that I think need them. The precise differences between what I am proposing and the current revision of the criteria for inclusion can be found here (This is now outdated). The following is the aforementioned change and rationale list:

  • Minor formatting changes
  • Moved related sentences which were previously on multiple lines onto the same line
  • Made some phrasing more specific when it was vague or broader when it was overly specific
  • Redid how the links to the combining acute accent and acute accent existed to more clearly communicate what was being discussed
  • Changed tone/style to be more explanatory and less matter-of-fact
  • Added a note to the section "Wiktionary is not an encyclopedia" as to avoid having it contradict the section "Names of specific entities".
  • Made a change suggested by a preexisting comment
  • Replaced all instances of "formulae" with "formulas" with the intent of improving ease of use since "formulas" is much more common worldwide(!) (see [Google Ngram] and [2])

The Editor's Apprentice (talk) 19:29, 4 April 2020 (UTC)Reply

Closing: This edit request isn't worth keeping open because it is unreasonable to expect someone to look through the whole large body of edits I am suggesting and because I am continuing to refine what I think the criteria for inclusion should look like. As an alternative pathway, I plan to try and suss out what might be the best way to gather community consensus on each of the categories of edits I am proposing one by one. If and when I do that, I plan to leave a note here pointing to the discussion where the new pathway was identified.—The Editor's Apprentice (talk) 01:29, 17 April 2020 (UTC)Reply
A pathway is identified at Wiktionary:Beer parlour/2020/April#Finding consensus on a large variety of non-content changes. Specifically, the plan is to create a vote with sections for each of the edit categories. I plan to update that section with any further developments and links to associated votes, so this is probably the last edit I'll make to this section.—The Editor's Apprentice (talk) 23:32, 17 April 2020 (UTC)Reply

Fictional names


Edit request:

Add "For fictional names, see fictional universes." to Wiktionary:Criteria for inclusion#Names.

In Wiktionary:Criteria for inclusion#Fictional universes remove or change "of persons" in the line "With respect to names of persons or places from fictional universes". It's not restricted to people, the Matrix or ACME aren't people. Alexis Jazz (talk) 13:26, 27 May 2020 (UTC)Reply

Conlangs


I can see why we shouldn't include every possible conlang out there, but what's the reasoning behind the 6 specific conlangs included? Why those and not others? Plokmijnuhby (talk) 20:47, 20 August 2020 (UTC)Reply

One reason is that for the smaller conlangs, finding three independent, durably archived citations is difficult or impossible, even for common words. (That's why Lojban was relegated to the appendix.) —Granger (talk · contribs) 22:26, 20 August 2020 (UTC)Reply
I think it was pulled from the list of conlangs in ISO 639-1. In general, Esperanto is the largest, with at least one magazine running for a century now and several in current publication, with tens of thousands of fluent speakers and at least a million with some interest in and knowledge of it. Volapük may be dead now, but had some massive usage at the time. Ido and Interlingua had some usage, and I wouldn't have included Interlingue and Novial, which seem to be one-man projects that died with their creators. There are some fictional conlangs, which we've excluded for copyright reasons as much as anything, but nothing else has remotely the body of speakers and writings that those languages do.--Prosfilaes (talk) 06:07, 21 August 2020 (UTC)
Toki Pona? Klingon? Solresol? Those are the other big conlangs left out. --165.155.170.46 14:36, 16 June 2021 (UTC)Reply
Klingon is fictional, and neither Toki Pona nor Solresol sees nearly enough use. Filling a shelf with books in Esperanto is no problem. You can buy hundreds of books completely in Esperanto from the Esperanto USA bookstore right now. https://wikisource.org/wiki/Main_Page/Volap%C3%BCk has 759 wikipages; not quite a bookshelf full, but a decent amount of text, including a full New Testament, and that's just a sample of what's out there. On Quora, someone pointed out the New Testament comes out between Harry Potter and the Prisoner of Azkaban, and the Half-Blood Prince in word count. I'd say before we even start discussing it, I'd like at least one New Testament/Harry Potter sized volume in the language not about the language. If you let it come in several books, Klingon about measures up with Hamlet, Gilgamesh, paq'batlh: The Klingon Epic, and Alice in Wonderland (11/2021 release date) approaching that. Toki Pona and Solresol don't.--Prosfilaes (talk) 03:30, 15 August 2021 (UTC)
Monty Python and the Holy Grail and The Life of Merlin have a Toki Pona Translation.— This comment was unsigned.
For one, translations aren't legal without the permission of the copyright owner, and I'm pretty sure there's no licensed translation of Monty Python into Toki Pona. Copyright infringements are almost certainly not durably archived. Secondly, a movie has, on average, 9,000 words of dialog. That's a long stretch from the hundred thousand word novel I was looking for (and I'd really prefer it in one volume, since then all the smaller works just exist.)--Prosfilaes (talk) 13:05, 21 August 2021 (UTC)
So there are no big translations except there are but they don’t count. 166.109.26.72 15:37, 19 November 2021 (UTC)Reply
So what are the core criteria for inclusion? Speaker amount? Books written? I'm specially concerned with Volapük, which you say is dead now. Do we have to add other dead languages? I'm asking with genuine intention of understanding, because I feel a bit confused TheChessTyper (talk) 01:20, 14 February 2022 (UTC)Reply
Pinging @Prosfilaes, who I am assuming you are replying to, TheChessTyper. It is generally smart to ping a user you're replying to if a conversation hasn't been active for a month or more, as they are unlikely to still be checking it for any changes. —The Editor's Apprentice (talk) 05:15, 14 February 2022 (UTC)
Thanks. I'm not used to talk pages. @Prosfilaes, please look at this if you have the time. TheChessTyper (talk) 17:15, 14 February 2022 (UTC)Reply
I don't think there are generally established core criteria. Personally, it's about having a body of durably archived text large enough that words actually can have three cites in independent works. Dead or not is irrelevant. w:Volapük says "In 1889, there were an estimated 283 clubs, 25 periodicals in or about Volapük, and 316 textbooks in 25 languages". [3] has 756 pages, with a number of different people writing or translating in Volapük. Klingon has Hamlet, Gilgamesh, paq'batlh: The Klingon Epic, and Alice in Wonderland, and a literary journal. You'd be hard pressed to give a word three independent cites from those works, but it's conceivable. If Klingon weren't encumbered copyright-wise, it might get in.
Conventional publishing is rarer today, but I'd really like to see a pile of material in the language, but not grammars or dictionaries, maybe 10,000 pages worth, with at least three major authors and a dozen authors in total. Volapük and Esperanto can do that and it looks like Interlingua can do that. Maybe Ido. One thing books have over email and web forums is that the text has usually been carefully written and edited, and the author is reaching a large group of people who can't or won't ask the author what they meant. They're also more durably archived, and designed to be more permanent; a large novel or cookbook will have future audiences, whereas an insular mailing list, even if it exists 50 years down the road, is unlikely to be seeing many readers.--Prosfilaes (talk) 01:43, 16 February 2022 (UTC)Reply
There are more than three published books in Toki Pona. You might not be able to give three citations for every compound of words that has been used to express a concept, but I am sure all of the basic "pu" words have been used three times. So, should we include it in the main namespace?
[Edit: Looks like my earlier assumption may have been wrong. I guess I just assumed that given how well known it is the material must exist, but no. I only found two published books by the same author, e.g. →ISBN. There are a lot of "books" which are online files, but may not exist in print. If our new CFI allows e-magazines, let alone Twitter and various other websites, then it would pass with flying colors.] 70.172.194.25 20:10, 18 February 2022 (UTC)Reply
There is a zine that had a physical copy for one issue. GTbot2007 (talk) 12:31, 13 September 2022 (UTC)
If copyright mattered then why can Klingon get an appendix? 166.109.26.70 13:35, 3 March 2022 (UTC)
Why not use the ISO 639-3 list? It has more languages. 166.109.26.70 13:32, 3 March 2022 (UTC)

Edit request


A new section, Further reading, is needed in which to keep the non-inline citation currently at the bottom of the References section. inqilābī [ inqilāb zindabād ] 15:01, 1 October 2020 (UTC)Reply

baseball bat


Is there consensus to keep things like baseball bat, which is arguably "a bat used in baseball" and therefore sum-of-parts? Troll Control (talk) 18:10, 20 December 2020 (UTC)

We are influenced by what other dictionaries include. “baseball bat”, in OneLook Dictionary Search, shows that some dictionaries besides Wiktionary include baseball bat (Wordnet 3.0, Vocabulary.com; the others do not link to dictionary entries for baseball bat). It is notable that MWOnline, Oxford/Lexico, Cambridge, Collins, AHD, and RHU, all "unabridged" print-based dictionaries, do not. We would almost always include a term if the "unabridged" ones did, but we often include terms that Wordnet includes no matter how much its meaning seems to directly follow from the meaning of its components. Including such terms seems to me to have the effect of pushing Wiktionary to become a short-attention-span encyclopedia, rather than a dictionary. Some of the arguments used to justify inclusion seem to imply that we need to include the term to help language learners who are not familiar with such a culturally specific item. There is also the argument that we need the entry to have a "translation hub". To the extent there is a consensus, I oppose it. DCDuring (talk) 17:22, 22 December 2020 (UTC)

字 is not an Ideograph According to Wiktionary


Hello! The sentence "Characters used in ideographic or phonetic writing such as 字 or ʃ." appears on this page and I would like to propose that it be modified. The character 字 is determinedly not a real ideograph or ideogram. The reason this conceptual error about Chinese characters exists is because of what The Chinese Language: Fact and Fantasy calls "The Ideographic Myth": the idea that "Chinese characters represent ideas instead of sounds." As that work and this 4384 character-strong phono-semantic compound category amply demonstrate, that's nothing but a misconception about Chinese civilization passed down to us by the Enlightenment thinkers, who also used leeches to get the humors out. I have done my part to turn the wayward ship of human civilization around on this issue, but it's very hard to do because the allure to the imagination of a civilization whose communication never symbolizes sound in writing but relies solely upon pictures and abstractions is so exhilarating. A lot of money is made by liars who write ugly books that trick people with nonsense glyph origin explanations that tickle the fancy rather than ground themselves in scholarship. The definition of ideogram Wiktionary has says that an ideogram is "A picture or symbol which represents the idea of something without indicating the sequence of sounds used to pronounce it." (my emphasis) Well, excuse me, but Xu Shen long ago concluded that the character 字 DOES indicate the sequence of sounds used to pronounce it when he wrote, "从子在宀下,子亦聲。", that is 'the character 字 is composed of a child 子 below a roof 宀; and the 子 component also indicates the sound'. Oops! The recently departed Zhengzhang Shangfang in his 2003 work (Zhengzhang, Shangfang (鄭張尚芳) (2003). 上古音系 (Old Chinese Phonology). Shanghai: Shanghai Education Press.) also concluded that 字 has a phonetic component. Either the English Wiktionary definition for ideogram needs modification, or we don't have an ideograph here.
For the sake of stopping the spread of the myth, I suggest substituting 字 with an unambiguous ideograph. The classic examples are 上 (shàng) and 下 (xià). --Geographyinitiative (talk) 19:15, 30 January 2021 (UTC)

Chinese isn't the centre of the universe. It's certainly an ideogram in Japanese. —Μετάknowledgediscuss/deeds 20:38, 30 January 2021 (UTC)Reply
@Metaknowledge But even in Japanese some types of kanji usage like ateji are the cases where character itself and its meaning are irrelevant, the former playing only a phonetic role. As Geographyinitiative(corrected --Eryk Kij (talk) 19:22, 1 February 2021 (UTC)) claims, I think the current text might be a bit misleading. --Eryk Kij (talk) 09:33, 31 January 2021 (UTC)Reply
How is ateji relevant to this character? —Μετάknowledgediscuss/deeds 19:14, 31 January 2021 (UTC)Reply
@Metaknowledge You see an individual part and I (and maybe also Geographyinitiative) its system. You think that the character 字 itself is less likely used as component(s) of ateji, so there is no problem. Now I accept that you may be right. In my view, every kanji have potentiality of being used as ateji; it is a part of a system. I cannot assert that kanji is unconditionally ideographic. --Eryk Kij (talk) 19:26, 1 February 2021 (UTC)Reply
@Geographyinitiative What do you think is more appropriate substitute? Symbol like ? --Eryk Kij (talk) 19:17, 1 February 2021 (UTC)Reply
@Geographyinitiative 字源 (Li, 2012) considers it a 會意兼形聲字. It can be considered an ideogram, in light of senses 15, 16 and 18 on Wiktionary. RcAlex36 (talk) 18:14, 9 February 2021 (UTC)Reply
Incredible responses all! Thanks for looking into this.
1) MK wrote "It's certainly an ideogram in Japanese." I might contest that on several grounds, but I think one of the more convincing arguments is that I strongly suspect that Japanese literati would have had a copy of Shuowen Jiezi and hence the elites would know what's up- the 10% who pass Level 1 of Kanji Kentei will likely know about phonetic components. Based on my limited knowledge of Japanese, I would hazard that it will not have escaped the Japanese that the Kan-on pronunciation is し (shi, Jōyō) for both 字 and 子, and then some will draw the inference that 子's appearance within 字 may have a connection with the similarity of pronunciation- that this similarity of pronunciation between the two characters is beyond blind coincidence, and that 子 may be functioning as a phonetic indicator. I realized the truth about the importance and value of phonetic components by a similar realization between HSK 5 and low passing HSK 6.
2) I don't understand the ateji argument. I recommend 上 or 下 as potential substitutes for 字. ('initiative' is very hard to spell, but I am doing better on spelling it correctly since I've used this user name)
3) To my understanding, the question is not whether someone can consider 字 to be an ideogram based on one of the definitions of the character. Anything can be considered an ideogram if you ignore the existing phonetic component in the character. The question is: does the character convey meaning "without indicating the sequence of sounds used to pronounce it." 字 conveys its sound via the phonetic component 子 (which is itself a pictograph). The argument that 字 has a pure 会意字 (ideographic compound) side doesn't help because we are looking for a 指事字 (ideograph/ideogram) in this sentence. --Geographyinitiative (talk) 18:43, 9 February 2021 (UTC)Reply
@Geographyinitiative: In my previous comment, by ideograph I meant 會意字. Seeing that 字 has senses related to "child" (though they are obsolete now), the character can in my opinion be considered to be in part 會意字. And obviously Chinese characters were intended for the Chinese language when they were created. RcAlex36 (talk) 18:52, 9 February 2021 (UTC)Reply

Edit request: section for nouns in the Idiomatic phrases section


I was just looking for guidance about how to create (or search for) an entry for an idiomatic phrase (in particular, the expression "as far as <x> go/goes"), and found a link to Wiktionary:Criteria for inclusion#Idiomatic phrases at the bottom of the body text of Category:English idioms. As I expected, there is guidance for various classes of words (pronouns, verbs and adverbs) but there was nothing about nouns.

Looking at some entries in Category:English idioms, it appears that the current practice for nouns is to substitute "something", as in give something a whirl, take something to the grave, or not touch something with a barge pole. Interestingly, in the latter's entry name (but not in the page title) the "something" is wrapped in parentheses and not linked to something. Perhaps it would be good if this section also suggested not adding such placeholder markers, which appears to be the prevailing standard.

Is there an opening to add this content? I'd be happy to propose a wording if so. --Waldyrious (talk) 20:02, 8 March 2021 (UTC)

Pinging the two most recent editors of the page for comments: Metaknowledge, Koavf. (Let me know if I should solicit feedback elsewhere.) --Waldyrious (talk) 12:03, 16 March 2021 (UTC)Reply
I couldn't say if there's an "opening", but all changes to this page have to reflect community consensus, by a vote or (for minor changes) after a discussion that reaches agreement. You can propose such minor changes at the Beer parlour. —Μετάknowledgediscuss/deeds 15:52, 16 March 2021 (UTC)
Just to not ignore you, I have nothing to add to Metaknowledge. —Justin (koavf)TCM 22:40, 16 March 2021 (UTC)Reply
[edit]

On the Wiktionary:Requests for verification/English page, there is a blue link on the words "permanently recorded media". I have done a lot of citations/quotations. I think Wiktionary:Searchable external archives should be linked on the words "permanently recorded media" here too. (cf. [4]). --Geographyinitiative (talk) 20:21, 5 April 2021 (UTC)Reply

@Equinox, Kiwima, Surjection Friends: as evidenced in this edit [5] in which an IP says in part "help me understand how you find "durable" citations and how to determine if a source is or isn't durable? For example, if a source is archived on archive.org does that make it durable?", there is an impassable obstacle to IP users in understanding the nuanced phrase "permanently recorded media"/"durably archived". Such an obstacle creates a motivation problem for lower-level editors and IP editors, because they will always be under the impression that all their citations are subject to deletion by an elite user at any moment due to a holy and mysterious interpretation of the adjective "durable". I believe that this obstacle is lowered/leveled by the above proposed bluelink- not so much of a step up anymore. For the sake of Wiktionary's accessibility to lower-level users, please add this link to the page. --Geographyinitiative (talk) 00:34, 2 June 2021 (UTC) (modified)
I made a start at testing the waters on this simple idea with the creation of a redirect for "Wiktionary:Permanently Recorded Media" (see history: [6]). This redirect is useful to the unwashed and uninitiated who know not what permanently recorded media is; hoping to encourage implementing my thoughts about a blue link for the words "permanently recorded media". --Geographyinitiative (talk) 13:00, 16 February 2022 (UTC)

Permanently Recorded Media Needs a Subsection


Let's look at the structure of this page:
1.2 Attestation
1.2.1 Conveying meaning
1.2.2 Number of citations
1.2.3 Independent
1.2.4 Spanning at least a year
If you're going to explain to people what the word "spanning at least a year" means in a whole subsection, how can you ignore that "permanently recorded media"/"durably archived" needs a subsection?? It should be subsection 1.2.1, shifting the other ones down. --Geographyinitiative (talk) 19:59, 11 July 2021 (UTC)Reply

I had a revert a day or two ago [7], and I want to suggest that the reason that the revert happened is because there is a distinct lacuna of written guidance with reference to the method to show that an attestation is in 'permanently recorded media' or is 'durably archived'. As I noted in my 11 July comment above, every phrase and word in the Wiktionary Attestation requirements sentence is elaborated on in a subsection, but there is no subsection for the magic phrase 'permanently recorded media'. As an editor who has dug up a ton of citations, I recommend a written guide to how to fulfill the permanently recorded media requirement, and not just a collection of websites or archives to go to- like literally, how do I prove to Wiktionary standard, using what ISBN, OCLC, OL, LCCN, DOI, JSTOR, etc that this is PRM. My current practice is "throw the kitchen sink at it" in the sense that I just add every identifying method and number not just as a way to verify something is PRM, but also just as a reference for interested persons. Also, you never know when one resource might become unavailable, or some situation might change in coming years or decades, etc. Here's an example of an ISBN that (currently) has no books in an OCLC library (showing that ISBN is not proof of an OCLC): 978-981-16-1725-6, 978-1-7372490-1-6. Also, the existence of an OCLC number doesn't prove that anything is really permanently recorded- see for instance →OCLC or →OCLC. --Geographyinitiative (talk) 01:22, 15 August 2021 (UTC)Reply

Usenet


"As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups..." I suspect that most people these days don't know what Usenet is (or was). Should that phrase be updated? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:21, 19 October 2021 (UTC)Reply

I agree. Usenet is still around, but its userbase must be tiny compared to modern Internet-based platforms. The Google Groups homepage doesn't even mention Usenet. But the impression I get is that while Google Groups may come and go like the wind, Usenet posts are never (or almost never) deleted, and therefore we cannot include the whole of Google Groups as durably archived, but we can include Usenet posts. Soap 15:54, 20 August 2022 (UTC)

Given and family names: just a bit noninclusive?


WT:Criteria for inclusion#Given and family names says: "Given names (such as David, Roger, and Peter) and family names (such as Baker, Bush, Rice, Smith, and Jones)".

I'm not usually too worried about PC ("politically correct"), but sometimes in modern times lists or examples just stand out as being a bit noninclusive. One can also say this of "Karen, Lisa, Sharon" or "Gonzales, Rodriguez and Lopez". Cheers, Facts707 (talk) 03:33, 25 November 2021 (UTC)

No harm in changing the three given names to (say) male, female and neutral, like "David, Rebecca, Leslie" (just my examples; don't yell at me for putting the man first, or not including a name from your favourite culture). You can never please everybody. Equinox 06:15, 25 November 2021 (UTC)Reply
I have gone ahead and changed these; happy to self-revert (or have someone else revert) if there are objections. —Granger (talk · contribs) 17:14, 25 November 2021 (UTC)Reply
Big yikes. As a mixed-race Kazakh-Polynesian Jew, I feel severely underrepresented by this new list. (Jokes aside, this is a total nothingburger.) — Mnemosientje (t · c) 11:51, 28 November 2021 (UTC)Reply

Are theses/dissertations considered durable?


Assuming they are available online for free, e.g. on ProQuest. 70.172.194.25 22:44, 30 December 2021 (UTC)Reply

Yes. Whether or not they are on ProQuest, they are stored forever at their respective universities (traditionally on microfilm), and therefore are more durable than many books! —Μετάknowledgediscuss/deeds 07:23, 31 December 2021 (UTC)Reply
I concur. bd2412 T 08:11, 31 December 2021 (UTC)Reply
I deny that theses on microfilm are durably published, or even more durable than books: due to how humans treat microform. In the 21st century microfiches are, in accordance with the art of the archivist and librarian, thrown away and burnt, due to the maintenance costs of the machines being too high and the convenience out of date, even under health concerns. Often they are squirmworthy politicians’ dissertations not desired to be widely available in the first place. Even if they were durably published, we have the dilemma of something being durable not being available, in spite of being published technically as required by university statutes—not always required. Though then you might never get to quote the work.
However if a university publishes a dissertation via the internet it is durable, for durability is an institutional guarantee.
So, here we have two legal theories about durability. The most vulgar one is that “materialization” constitutes durability; my position here is that “institution” does, or community, humans carrying over material, who may be academics or only fan communities (as of rap songs that are deleted often). This has the counterintuitive result that language may be durably published in no clear location but is just the flow of information, and may even cease being durably published, i.e. all is only for the time being durably published, but the materialization theory is demonstrably incorrect: The colleagues of Homer have not been durably published, in spite of having had their works materialized. Fay Freak (talk) 20:21, 3 January 2022 (UTC)Reply

Are archived TV news programmes considered durable?

For example, if they are viewable in the archive.org TV news collection. 70.172.194.25 06:47, 2 January 2022 (UTC)

GitHub Arctic Code Vault

Are GitHub repositories that have been backed up to the Arctic World Archive considered durably archived? Note that I am specifically referring to natural language content, such as documentation, found within such repos, not just code written in computer languages. Even deletion from GitHub does not affect the preservation of the data itself: "For the GitHub Arctic Code Vault, we are unable to remove data that has already been stored" [8]. It's hard to beat storage in an arctic vault for long-term preservation, not to mention the fact that anyone can easily make a local copy of a GitHub repo including all history.

I'm guessing this is not considered acceptable under CFI, though. 70.172.194.25 21:55, 15 January 2022 (UTC)

Request for editing the locked page

@User:Surjection @User:Saltmarsh Please remove the explicit interwiki links. They are at Wikidata now. Taylor 49 (talk) 15:44, 18 June 2022 (UTC)

Done. SURJECTION / T / C / L / 16:49, 18 June 2022 (UTC)

Minor edit request

Can someone with permissions please add WT:DEROG as a shortcut of WT:DEROGATORY? "WT:DEROGATORY" is rather long, so an easier-to-type secondary shortcut would help. WordyAndNerdy (talk) 23:03, 1 August 2022 (UTC)

"Omit an initial article unless it makes a difference in the meaning. E.g., cat’s pajamas instead of the cat’s pajamas."

What on earth is the difference? I looked them both up: they now redirect to the same definition. —DIV (1.145.119.99 06:04, 22 September 2022 (UTC))

Maybe it wasn't the best example (e.g. we could have used bomb and the bomb), but it's still valid in that *a cat's pajamas has no meaning apart from the literal sense. The correct idiom is the cat's pajamas and I think it should be moved there. Soap 11:07, 18 March 2023 (UTC)

SOP or SoP

Is there a consensus on using the abbreviation SOP over SoP? —DIV (1.145.119.99 06:05, 22 September 2022 (UTC))

A term to be included if someone is likely to want to know what it means

"A term should be included if it's likely that someone would run across it and want to know what it means": Untrue, especially if the "if" is interpreted as "if and only if". For a start, Wiktionary includes individual letters as entries, and they are not the kind of things for which one wants to know what they mean. Second, it implies the meaning is the only kind of non-compositional information one may want to look up in a dictionary, but that is also untrue: there is etymology, spelling, pronunciation, inflection, etc. Third, if proper names do not have "meaning" (that is, if the instances to which they refer are not considered to be "meaning"), inclusion of proper names is another falsifier/refuter of the sentence.

Ideally the sentence would be dropped; that was tried in the past without success. But there is always hope. --Dan Polansky (talk) 09:50, 2 January 2023 (UTC)

Potential Exception to "We do not quote other Wikimedia sites (such as Wikipedia)": Brazilian aardvark

Check out Wikipedia:List of citogenesis incidents. It would be a shame to omit a cite showing the true origin of a word like Brazilian aardvark just because of this rule. I believe that this rule is too broadly worded and doesn't capture the actual meaning from [9] and [10]. ("Hold my diff." & diff) I'm not saying that people should quote Wikipedia diffs on the Wiktionary mainspace, but here I have an example of a sense for a word that seems to originate at English Wikipedia. Are you saying that Wiktionary shouldn't be able to quote that, at least on the Citations page? I think those votes don't mean that, but that the rule literally means that. Anyway, just a thought. See also Wiktionary:Beer_parlour/2023/May#Quoting_Wikipedia_Diffs_in_Wiktionary:_An_Exception. --Geographyinitiative (talk) 11:34, 2 May 2023 (UTC) (Modified)

Chemical formulae: CO2?

At Wiktionary:Criteria_for_inclusion#Chemical_formulae we have

a murder mystery saying "the air in his scuba tank had been replaced with CO2" could support CO₂.

I am not convinced that this is a good example. Wouldn't relevant instances of "CO2" being used support an entry for CO2#Translingual or possibly CO2#English, not CO₂#Translingual? In the present case the former is hard-redirected to the latter anyway, but I'm not sure that resolves the principle of the matter. —DIV (1.145.124.86 01:46, 15 February 2024 (UTC))

"Roughly speaking"-->Independent requirement in WT:ATTEST

The "Independent" requirement includes the words "Roughly speaking" ("Roughly speaking, we generally consider two uses of a term to be "independent" if they are in different sentences by different people"), and thank God it does include those two words. Otherwise, anonymously written works from Reuters, AP News, etc. in newspapers with OCLC numbers would be totally worthless for WT:ATTEST, because there is no record of the authors of those reports and hence no way to gauge independence from the other two works required for attestation. I remember Equinox writing once that works from two different government agencies might be allowed to stand under this requirement, but of course, what if the translator switched jobs between departments? I'm interested in this because I explore some somewhat rare terms that appear prominently in semi-anonymous literature and translations of Chinese. I would take into account not just authors, but translators as well. It's a tough requirement for me sometimes. --Geographyinitiative (talk) 09:30, 1 July 2024 (UTC)