This workshop is supposed to teach you:
In the examples we will work on the Estonian Wikisource and refer to the corresponding pages where they have been translated into Estonian. Some help pages are not translated, though, so for those we will refer to the English pages.
We will now show you how you can contribute to Wikisource. There are a number of tasks from a (possibly printed) source material to the final version of the text appearing on Wikisource, and you can contribute to each of them separately.
For more detailed instructions on how to contribute, go to Wikisource help on adding texts.
Texts on Wikisource are based on original sources, which can be printed books or journals, for example. In order to publish these texts, they need to be available in digital form. If you have printed material only, there is a whole set of instructions on digitising texts and images, which we will not cover in this workshop. Instead, we will assume that you already have a digital copy. Files in the DjVu format are preferred.
Instead of digitising the text yourself, you can also check whether this has been done already, and the text is available and free to use in an online repository. For texts in Estonian, you can try these archives:
Make sure that the text you wish to upload satisfies the inclusion guidelines. Artistic works must have been published in a medium that undergoes editorial review, such as a novel issued by a publishing company; self-published works do not qualify. Journal articles and official documents issued by administrative authorities can also be uploaded. Refer to the link above for details.
Before uploading, make sure that you have permission to do so. This workshop is not legal advice, as an introduction to copyright would need a workshop of its own; for details, refer to Wikisource help on copyright. To give you a few examples of texts you may upload:
In Estonia, the latter includes works whose author died at least 70 years ago.
Once you have obtained a digital copy which you are permitted to upload, it should be uploaded to Wikimedia Commons. Go to the upload wizard, and complete the following steps:
You are also asked for a name for the newly created file page. Use a consistent naming scheme that makes it easy to uniquely identify the text, but be brief. For example, Title Author Year.djvu is a good scheme. Wikimedia Commons will prepend the prefix File: to this name.
Once the file is uploaded and the data is complete, you will see the file page.
Example: Assume you have uploaded the book Ülemiste vanake by Oskar Luts, and named the page Ülemiste vanake Luts 1919.djvu. Have a look at File:Ülemiste vanake Luts 1919.djvu on Wikimedia Commons for what the result should look like.
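The naming scheme described above can be sketched in a few lines of Python. The function name commons_file_name is hypothetical and only illustrates the Title Author Year.djvu pattern; it is not part of any Wikimedia tool.

```python
def commons_file_name(title: str, author: str, year: int, ext: str = "djvu") -> str:
    """Build a 'Title Author Year.ext' file name (hypothetical helper)."""
    parts = [title.strip(), author.strip(), str(year)]
    return " ".join(parts) + "." + ext


name = commons_file_name("Ülemiste vanake", "Luts", 1919)
print(name)            # Ülemiste vanake Luts 1919.djvu
print("File:" + name)  # title of the file page on Wikimedia Commons
```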
In order to import the text to Wikisource, the next step is to create an index page. This page serves as an overview for coordinating the following steps. It is created in the Wikisource of the language of the text you uploaded - in this workshop, this will be the Estonian Wikisource. To create the index page, replace Esileht in the link above with Register:XXX.djvu, where XXX.djvu must be the name of the file page on Wikimedia Commons.
Example: For the book Ülemiste vanake mentioned before, it shall be Register:Ülemiste vanake Luts 1919.djvu.
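As a minimal sketch, the index page title and its address can be derived from the file name as follows. The prefixes Register and Index are the ones named in this workshop; the helper names and the URL layout (https://LANG.wikisource.org/wiki/TITLE with spaces replaced by underscores) are assumptions based on the usual MediaWiki address scheme.

```python
from urllib.parse import quote

# Index namespace prefixes mentioned in this workshop.
INDEX_PREFIX = {"et": "Register", "en": "Index"}


def index_page_title(file_name: str, lang: str = "et") -> str:
    """Return the index page title for a file uploaded to Commons."""
    return f"{INDEX_PREFIX[lang]}:{file_name}"


def index_page_url(file_name: str, lang: str = "et") -> str:
    """Return the (assumed) address of the index page."""
    title = index_page_title(file_name, lang).replace(" ", "_")
    # Percent-encode non-ASCII characters, keep the namespace colon readable.
    return f"https://{lang}.wikisource.org/wiki/{quote(title, safe=':')}"


print(index_page_title("Ülemiste vanake Luts 1919.djvu"))
# Register:Ülemiste vanake Luts 1919.djvu
print(index_page_url("Ülemiste vanake Luts 1919.djvu"))
```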
Now you are again asked to provide some information, such as:
The most important (and most complicated) field for further processing is the "pages" field. It should describe which pages are included in the following process and which are not (because they are empty, for example), together with their page numbering. In the simplest case, this will be just <pagelist/>, which includes all pages and numbers them sequentially, starting from 1. Often you want to change this, to take into account an extra name for a cover page, unnumbered / empty pages, different numbering schemes (Roman vs Arabic) for the table of contents… See the pagelist tag for detailed help.
Once you have created the index page, it should show the cover image, the metadata you entered and a list of pages. The page numbers are shown in red - this will change in the next step.
Example: Have a look at Register:Ülemiste vanake Luts 1919.djvu. Here the list of pages reads <pagelist 1=Cover 2=- 3=1 32to36=-/>, which means:
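The effect of these attributes can be illustrated with a small Python model. This is a simplified sketch, not the actual implementation: it assumes that N=number restarts the numbering at scan page N, that N=text gives a single page a literal label, and that AtoB=text labels a whole range. Numbering styles such as Roman numerals are not handled, and the function name page_labels and the total of 36 scan pages are assumptions made for this illustration.

```python
import re


def page_labels(pagelist: str, total_pages: int) -> dict[int, str]:
    """Map each scan page (counted from 1) to the label shown on the index page."""
    attrs = re.findall(r'(\d+)(?:to(\d+))?=("[^"]*"|[^\s/>]+)', pagelist)
    single, ranges = {}, []
    for start, end, value in attrs:
        value = value.strip('"')
        if end:
            ranges.append((int(start), int(end), value))
        else:
            single[int(start)] = value

    labels = {}
    offset = 0  # difference between scan page number and printed number
    for page in range(1, total_pages + 1):
        value = single.get(page)
        if value is not None and value.isdigit():
            offset = page - int(value)  # numbering restarts here
        label = str(page - offset)
        if value is not None and not value.isdigit():
            label = value  # literal label for this single page
        for start, end, range_value in ranges:
            if start <= page <= end:
                label = range_value  # this sketch treats range values as literal labels
        labels[page] = label
    return labels


labels = page_labels("<pagelist 1=Cover 2=- 3=1 32to36=-/>", 36)
print(labels[1], labels[2], labels[3], labels[31], labels[32])
# Cover - 1 29 -
```

Under this model, scan page 31 carries the printed number 29, which shows why scan (logical) page numbers and printed page numbers must be kept apart in later steps.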
NB! Index pages are prefixed with Register: in the Estonian Wikisource. This prefix depends on the language. In the English Wikisource, it is Index:.
The most tedious task of importing texts is proofreading. The files uploaded to Wikimedia Commons and to be imported are processed with text recognition (OCR), but this is far from perfect. Hence, each page will need to undergo manual reading and editing if necessary, so that the text will match the uploaded original. To proofread a page, click on its page number. If the page has not yet been proofread (indicated by a red page number), it will take you to the proofreading editor. Most importantly, this page shows you:
Read carefully through the text in the text box, compare it to the scanned page and make corrections.
The text should also preserve the formatting and style of the original page; this is a whole topic on its own.
Note that templates are language specific, the ones linked here are in Estonian.
Once you are satisfied with the result, go to the bottom of the page. Here you can click the button to preview the page. Note that it is not yet saved, so you will lose your data if you leave the page. You will find a few colored radio buttons here as well, where you can indicate the status of the page. The yellow color indicates that the page has been proofread. If you click it, it will also add a summary of your edit in the text box, which you can leave as it is. Finally, click the button to publish the page.
When you are done, you will see the complete page, with header, content and footer, and a yellow notice that this page has been proofread. It will also appear in yellow on the index page now.
Note that often there are also pages without text in an uploaded document, such as empty pages or pages with images only. In this case, choose the gray radio button instead, which indicates that there is no text.
You can also mark pages as problematic, i.e., needing discussion (blue) or work in progress (red) if you are not finished proofreading. See the page status help for an overview and explanation of page statuses.
Example: Take a look at page 8 of Ülemiste vanake.
Once a page is proofread and published, it will appear formatted and is available to be read. To improve content quality, Wikisource includes another step to validate contents of the page. This means that a second person should read the proofread page, again carefully compare to the original scanned page, and confirm that the proofread text is correct.
To validate a page, choose a proofread (yellow) page from the index, and click on the page number. This will show you the page contents. Read and compare, and then edit the page. Make any corrections if necessary. Finally, go to the bottom of the editor. Select the green radio button to indicate that the page is validated. (NB! This button is not available to the person who proofread the page; validating must be performed by another person.) This will also fill the edit summary field. Submit when you are done.
When you are done, you will see the page again with a green notice, indicating that it is now validated. It will also appear green on the index page.
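The page statuses and the rule that a page cannot be validated by its own proofreader can be summarised in a small Python model. This is only an illustration of the workflow described in this workshop; the names PageStatus and can_validate are hypothetical and do not correspond to any Wikisource interface.

```python
from enum import Enum


class PageStatus(Enum):
    """Page statuses and the colours used for them in this workshop."""
    NO_TEXT = "gray"
    IN_PROGRESS = "red"
    PROBLEMATIC = "blue"
    PROOFREAD = "yellow"
    VALIDATED = "green"


def can_validate(status: PageStatus, proofread_by: str, user: str) -> bool:
    """A page can be validated only if it is proofread and the user is someone else."""
    return status is PageStatus.PROOFREAD and user != proofread_by


print(can_validate(PageStatus.PROOFREAD, "Alice", "Bob"))    # True
print(can_validate(PageStatus.PROOFREAD, "Alice", "Alice"))  # False
```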
The steps done so far will give you a number of pages containing the content of the document. In order to present a contiguous text to the reader, these individual pages need to be transcluded. This means that you will create one or several pages which contain the final content of the document. For short documents, this can be a single page. For longer works, it can be one page for each chapter or section.
To start transcluding, create a new page on the Wikisource of the language you are working on. In this case, there will be no prefix (like File: or Register:). The title of the page will most simply be just the title of the document, in case of a single page, or have the form Title/Chapter for longer works containing several chapters.
In case of several chapters, there should also be a title page, which has a list of chapters, and its name should be the title of the work. Either this title page, or the single page in the case of a shorter document, should start with a header template with (at least) the name of the author and the title of the work.
The actual content is not copied here, but "transcluded" from the digitised pages with the pages tag. This tag lists a range of pages whose content should be included here; see the example below and the linked help page for details. This tag has several parameters, the most important ones are:
index="XXX.djvu" refers to the source from where the pages should be transcluded. Here XXX.djvu should be the name of the file you uploaded.
from=x and to=y are the first and last pages to be transcluded. Note that x and y are not the printed page numbers, but the logical page numbers, counted from 1 for the first page of the book.
Finally, you may wish to include categories so that your work can be found more easily. Common categories include the publishing year or the type of work.
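Putting the header, the pages tag and the categories together can be sketched as follows. The template name Päis with its parameters autor and pealkiri and the prefix Kategooria are taken from the Ülemiste vanake example in this section and are specific to the Estonian Wikisource; the function name transclusion_page is hypothetical.

```python
def transclusion_page(author, title, index_file, first, last, categories=()):
    """Return the wikitext of a single-page transclusion (Estonian Wikisource)."""
    lines = [
        # Header template with author and title.
        "{{Päis|autor = " + author + " | pealkiri = " + title + "}}",
        # first/last are logical (scan) page numbers, not printed ones.
        '<pages index="%s" from=%d to=%d/>' % (index_file, first, last),
    ]
    lines += ["[[Kategooria:%s]]" % name for name in categories]
    return "\n".join(lines)


print(transclusion_page(
    "Oskar Luts", "Ülemiste vanake",
    "Ülemiste vanake Luts 1919.djvu", 5, 31,
    categories=["Näidendid", "1919"],
))
```

The printed text can be pasted into the new page in the editor; nothing here contacts Wikisource.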
Example: Have a look at the page Ülemiste vanake for an example, and check out the source:
{{Päis|autor = Oskar Luts | pealkiri = Ülemiste vanake}} creates the header, in this case containing author and title.
<pages index="Ülemiste vanake Luts 1919.djvu" from=5 to=31/> transcludes pages 5 to 31 - these have the content of the book.
[[Kategooria:Näidendid]] and [[Kategooria:1919]] include this page in the categories 1919 and Näidendid.
Author pages gather the works written by the same author and provide an overview and introduction point.
To create an author page, create a new page in your language's Wikisource, whose title has the form Autor:Name (in the Estonian Wikisource; in English the prefix is Author: instead). This page should start with an author template, which includes some key information, such as the name, years of birth and death, a link to Wikipedia and the name of an image file from Wikimedia Commons, which will then appear on the author page.
Below the header, list the works of the author. Here you can use the usual MediaWiki syntax to create sections (starting from section level 2) and links to refer to the transcluded works, e.g., [[Title]] will link to the page named Title.
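A list of works in this form can be generated with a short Python sketch. It relies only on standard MediaWiki syntax: a level 2 section is written as == Heading ==, a list item starts with an asterisk, and an internal link uses double square brackets. The function name works_section and the section heading are hypothetical, and the author template itself is left out because its name and parameters depend on the language edition.

```python
def works_section(heading, titles):
    """Return wikitext for a level 2 section listing links to works."""
    lines = ["== " + heading + " =="]
    lines += ["* [[" + title + "]]" for title in titles]
    return "\n".join(lines)


print(works_section("Näidendid", ["Ülemiste vanake"]))
# == Näidendid ==
# * [[Ülemiste vanake]]
```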
Example: See the author page of Oskar Luts.