Digitisation specifications for paper records in Public offices
Technical requirements set out in this specification aim to ensure that digitisation efforts result in the creation of authentic, reliable, and usable digitised copies of paper records in accordance with General retention and disposal authority: Original or source records that have been copied (GA45).
When to use this specification?
This specification supports digitisation efforts, including:
- business process digitisation,
- back-capture projects, and
- scanning of incoming records i.e. mail / correspondence
that involve State records in paper formats, including:
- maps and plans
- bound or unbound volumes, registers, publications etc.
- photographs that are embedded in or contained within a document
The specification does not apply to paper-based State archives created prior to 1980, photographic series, film or audio-visual material.
Whilst it is not mandatory to adopt the specifications, doing so will enable your public office to meet the copying provisions set out in General retention and disposal authority: Original or source records that have been copied (GA45) without further analysis of technical requirements (see variations on specifications).
This specification sets out the recommended minimum technical requirements for digitisation, including:
- Colour mode and bit-depth (bi-tonal, greyscale, colour)
- Resolution (pixels per inch @1:1 ratio)
- Compression type
- File format(s)
- Colour management
Note: specifications relate to the final output. Input or raw data should be equal to or greater than the output.
|Document type|Bit depth / Colour mode|
|---|---|
|Black and white, clean, high-contrast, word-processed documents containing text and line art only.|1 bit Bi-tonal|
|Greyscale or black and white documents, including those that contain watermarks, grey shading, and grey graphics.|8 bit Greyscale|
|Documents with discrete colour used in text or diagrams, and coloured documents.|24 bit Colour (8 bits per channel)|
Resolution is measured at a 1:1 ratio / 100% scale (original size).
Note: not all black and white documents have the appropriate level of contrast for bi-tonal scanning. Testing may be required. If images result in missing detail, use greyscale settings.
Smart or lossy compression methods can be used provided artefacts are minimised, i.e. a high quality threshold is applied or files are managed to reduce degeneration.
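As a rough illustration of how colour mode and resolution drive file size, the sketch below estimates the uncompressed size of a page captured at 1:1 scale. The function names, the page size, and the 300 ppi figure are assumptions made for this example only; they are not values taken from this specification.

```python
import math

# Bits per pixel for the colour modes listed above.
BITS_PER_PIXEL = {
    "bi-tonal": 1,
    "greyscale": 8,
    "colour": 24,  # 8 bits per channel, 3 channels
}


def pixel_dimensions(width_in, height_in, ppi):
    """Pixel width and height of a page captured at 1:1 scale."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_in, height_in, ppi, colour_mode):
    """Approximate uncompressed size, padding each pixel row to a whole byte."""
    width_px, height_px = pixel_dimensions(width_in, height_in, ppi)
    bytes_per_row = math.ceil(width_px * BITS_PER_PIXEL[colour_mode] / 8)
    return bytes_per_row * height_px


if __name__ == "__main__":
    # A4 page (8.27 x 11.69 inches); 300 ppi is an illustrative value only.
    for mode in BITS_PER_PIXEL:
        size = uncompressed_bytes(8.27, 11.69, 300, mode)
        print(f"{mode}: {size / 1_000_000:.1f} MB uncompressed")
```

Estimates like this can help when planning storage for a back-capture project, although actual file sizes depend on the compression type and file format selected.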
Public offices should select file formats that are most appropriate for their purpose.
File formats must:
- be suited for long-term sustainability and accessibility (for a list of sustainable formats consult our guidance on sustainable file formats)
- meet compression requirements
Additional considerations for file formats include:
- ability to hold metadata
- ability to support text / optical character recognition (OCR)
- page display (single page vs multipage)
- compatibility with software programs
Where the imaging device has the ability to assign an ICC profile/colour space, it is recommended to apply sRGB settings to colour images.
Input vs. Output
Specifications reflect the minimum output required.
To avoid the adverse effects that up-sampling and/or interpolation have on image quality and integrity, workflows should be designed to ensure input data equals or exceeds the final output. Up-sampling should be avoided across all technical components (including resolution, bit depth and compression).
Up-sampled or interpolated images do not meet the requirements of this specification.
Tip: Optical resolution indicates a device's maximum resolution without interpolation.
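The input versus output rule can be expressed as a simple pre-capture check. The following is a minimal sketch, not a prescribed method: the `ImageSettings` class and `upsampling_issues` function are hypothetical names, and the check covers resolution and bit depth only (compression is left out for brevity).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSettings:
    ppi: int        # resolution at 1:1 scale, in pixels per inch
    bit_depth: int  # 1 (bi-tonal), 8 (greyscale) or 24 (colour)


def upsampling_issues(capture: ImageSettings, output: ImageSettings) -> list:
    """Return a list of reasons why the capture settings would force up-sampling."""
    issues = []
    if capture.ppi < output.ppi:
        issues.append(
            f"capture resolution {capture.ppi} ppi is below output {output.ppi} ppi"
        )
    if capture.bit_depth < output.bit_depth:
        issues.append(
            f"capture bit depth {capture.bit_depth} is below output {output.bit_depth}"
        )
    return issues


if __name__ == "__main__":
    # Illustrative values only; they are not taken from this specification.
    capture = ImageSettings(ppi=200, bit_depth=8)
    output = ImageSettings(ppi=300, bit_depth=24)
    for issue in upsampling_issues(capture, output):
        print("Does not meet input vs. output rule:", issue)
```

An empty list means the capture settings equal or exceed the output settings, so no up-sampling or interpolation is needed to produce the final image.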
Use of image enhancements (sharpening, background removal, auto-colour etc.) should be tested for suitability prior to use. To retain the record's authenticity, it is preferable to minimise the use of enhancements when good quality images can be obtained without them.
Variations on specifications
The above technical specification is intended to enable easy conformance with requirements under General retention and disposal authority: Original or source records that have been copied (GA45), i.e. where there is the intent to destroy the physical originals after digitisation. It also meets quality expectations for archival transfer.
Public offices can choose to relax requirements for short-term records (records required to be retained for 10 years or less) or for reference use copies. The decision to do so should be reached after confirming that the proposed specifications are appropriate to meet all reasonable business uses, including text / optical character recognition (OCR) for content search and discoverability.
The decision to lower specifications should be documented and supported by useability testing.
It may be appropriate to increase technical specifications depending on the format and amount of detail contained in the physical record. For example, a higher resolution might be needed to ensure that fine detail or print on a map is reproduced legibly.
Image capture specifications should be documented, for example within procedures for ongoing digitisation activities and within project requirements for back-capture projects. Documenting requirements supports the Public office’s record of digitisation activities.
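One possible way to document capture specifications is as a structured record kept alongside procedures or project requirements. The sketch below is an assumption about how such a record might look: the `CaptureSpecification` class, its field names, and all example values (including the resolution, compression, and file format shown) are hypothetical and are not prescribed by this specification.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CaptureSpecification:
    document_type: str
    colour_mode: str
    bit_depth: int
    resolution_ppi: int                      # at 1:1 scale
    compression: str
    file_format: str
    colour_profile: Optional[str] = None     # e.g. sRGB for colour images
    variation_reason: Optional[str] = None   # why the specification was raised or lowered


def to_json(spec: CaptureSpecification) -> str:
    """Serialise a capture specification so it can be filed with project records."""
    return json.dumps(asdict(spec), indent=2)


if __name__ == "__main__":
    # All values below are illustrative only.
    spec = CaptureSpecification(
        document_type="Coloured correspondence",
        colour_mode="colour",
        bit_depth=24,
        resolution_ppi=300,
        compression="lossless",
        file_format="TIFF",
        colour_profile="sRGB",
    )
    print(to_json(spec))
```

Recording the reason for any variation in the same place keeps the decision and its supporting testing easy to locate later.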
Published January 2020