|
Introduction
The following tip describes the processes involved in importing information from a Microsoft® Excel® file into a TextWorks textbase. The content and format of an Excel file can vary considerably, so it is not possible to spell out all of the possible variations that you may encounter in importing such a file into TextWorks. Nevertheless this tip covers all of the most commonly encountered situations.
Importing from an Excel file is a procedure used by many Maxus customers. If you have recently installed TextWorks and your information is currently stored in an Excel file, importing that content straight in a TextWorks textbase will save you a great deal of time and double-handling. Likewise, if you are moving to TextWorks from another software product, it is highly likely that you can export your data from the other product in a spreadsheet format that lends itself to a fast and easy import into TextWorks.
Note: It is recommended that you back up your textbase before carrying out any import.
Points to note in Excel
- The column headings in Excel must be identical to field names in the TextWorks textbase into which the data will be imported. If they are not identical, the column headings should be changed before importing. If your Excel file does not have column headings that match the field names in your textbase, add a row to your Excel file and add the field names into the top cell in each column.
- The records in the import file must not contain more fields than the textbase into which they are being imported, otherwise the record will be rejected. If there are more columns in the Excel file than fields in the textbase, you should delete the extra columns.
- If you have multiple entries within a column/field (e.g. several subject headings) make sure these are not separated with a carriage return. The default character for importing multiple entries into TextWorks is a vertical bar (|). For example: Costumes|Art history|Buildings.
- Save the Excel file as a CSV file (comma separated values). To do this, the Excel command is File>Save As then use the Save as type droplist to select CSV (Comma delimited) (*.csv) as the file type:

How to import the file into TextWorks
- Open your textbase and then click on File>Import
- The Select Import File dialogue appears. Select the file you wish to import then click Open.
- In the Import Options dialogue, select File Format and specify Delimited ASCII format.
- Once you have specified Delimited ASCII format, the Delimiter Options and Field Names sections will appear. To specify the appropriate delimiter characters that are contained in your CSV file, select options from the droplists. {CR}{LF} denotes a carriage return and line feed. {LF} denotes a line feed only. {TAB} indicates a Tab character.
- A standard Excel CSV file defaults to {CR}{LF} in the Record Separator setting, , (comma) in the Field Separator setting, and " (double quote marks) in the Quote Character setting. You will only need to use different settings if your CSV file does not use these options. The following screen shot shows which settings to use in the File Format tab of the Import Options dialogue if your CSV file is a standard one:

- If you need to see which characters are used in your CSV file, open the file in a program like Microsoft Notepad. This will enable you to see the formatting characters clearly. (If you open the CSV file in Excel, you will see the contents in spreadsheet cells but you will not be able to see the formatting characters that are used by Excel to interpret the contents of the file.)
- Use the Record Separator setting to specify the character(s) used to indicate where each record ends.
- Use the Entry Separator setting to specify the character(s) used between multiple (repeating) entries in a field. As mentioned earlier, the default is a vertical bar (|). The following example shows three entries separated by a vertical bar:
Information technology|Personal computers|Programming
- Use the Quote Character setting to specify which character is used to enclose field information. For example, a quotation mark (") is used here:
"Smith","146 High Street","North Sydney","NSW"
Select {NONE} if field information is not enclosed, for example:
Smith,146 High Street,North Sydney,NSW
If a different character is used, type it in the box. For example, type ' (a single quotation mark) if your input file looks like this:
'Smith','146 High Street','North Sydney','NSW'
Note: To embed a quote character in the data without having it interpreted as a quote character, you will need to make sure it is typed twice wherever it occurs in the CSV file, as with the double set of quotation marks below:
"Smith (aka ""Smithy"")","146 High Street","North Sydney","NSW"
NOTE: When a file is saved from Excel in CSV format, fields containing quotes or commas are automatically enclosed in double quotes, as shown in the example immediately above.
- Use the Field Separator setting to specify which character is used between fields to indicate where each field ends. The default field separator is a comma (which is why the file type is known as "comma separated values"). Be sure to use a character that would never appear in the content of a field. If a field contains a character that you have specified as a field separator, the field information needs to have been surrounded with the specified Quote Character. The following example uses quotation marks to cause the comma after 'Smith' to be interpreted as a comma instead of as a field separator:
"Smith, John",Introduction to computer interface design,1987
- Information preceded by the specified comment character will be ignored during the import. Information is ignored from the comment character up until the Record Separator character (usually a line break). Comment lines are generally used to provide information to other people reading the file – they are not intended to be interpreted as record information. The default comment character is an exclamation mark (!).
- If a field contains a comment character such as an exclamation mark, which you want to be interpreted as data, not as a comment delimiter, make sure the phrase has been enclosed in the specified Quote Character. For example:
Butler,"Howzat!: sixteen Australian cricketers talk to Keith Butler",Koala Press,1995
- Make sure you select First Row Contains Field Names as the Field Names setting.
Add/Replace options
If you wish to check the incoming records against records already in the textbase, tick the box marked Check for Matching Records. Leave this option clear if you are adding new records and you are sure that the records in the import file do not duplicate existing records (or other records in the import file).
If you ticked the Check for Matching Records box, you need to specify what to do if a match is found from the options:
- Reject New Record to avoid duplicates.
- Append Field Entries to add new field entries in duplicate records.
- Replace Fields to change field information.
- Replace Old Record to replace existing records.
- Delete Old Record to delete existing records.
If you are matching records, select a field or fields to be used for the match. Use the arrow buttons to move one or more fields from the Term-Indexed Fields list into the Incoming Records Must Match list. It is important to use a field that is likely to be unique. You can use more than one field if necessary.
Validation
If one or more fields have content validation, incoming records must conform to that validation. If no fields have validation, the validation import options are disabled.
Records that are rejected during an import are copied to the exception file.
Exception file
An exception file is a text file that contains any records that are rejected during an import together with the reason why they were rejected. Unless you specify otherwise, the exception file is saved in the same directory as the import file and uses the same name (with the extension .Xnn, starting with .X01, then .X02 etc).
You can correct records in the exception file, save the corrected file, and then import the corrected exception file into the textbase. (In the Select Import File dialogue, remember to select Exception Files (*.x*) from the Files of Type droplist to display exception files.)
Strip leading spaces
Leading spaces are spaces or tabs in the import file at the beginning of continuation lines – they appear immediately after a new entry mark, and after line breaks in the data. Keep leading spaces if you want to retain formatting such as margins or other indentation.
Import Method
The Import Method determines the speed of the import and whether other users can use the indexes during an import.
Express Import
This method provides the fastest imports because TextWorks appropriates as much of the system resources as possible.
Interruptible Import
This results in significantly slower imports. However, it allows others to use the textbase during the import.
Begin the Import
After specifying options, click OK to display the confirmation message to which you can click Yes to begin the import or No to return to the Import Options dialogue.
If you notice that most or all of the records are being rejected during the import, you can click the Stop Import Operation button to interrupt the import.
When the import is complete, you will receive a message with the results. If you used an exception file, and any records were rejected, the message gives the name and location of the exception file. When you click OK to dismiss the message, TextWorks automatically opens the exception file in a Microsoft Notepad window (unless the file is too large for Notepad!). Make changes to the exception file as described above, then import the exception file.
How can you tell that you have successfully imported all of your records?
When you reach the point at which TextWorks imports all of your records cleanly (that is, the software tells you that no records have been rejected and no exception file has been created), your import operation is complete and all of your records have been transferred from your Excel file to your textbase.
Maxus Australia Pty Ltd -
Contact us for any further information or other tips.
|