cKlee/PicaReader

PicaReader – Classes for reading Pica+ records

About

PicaReader provides classes for reading Pica+ records encoded in PicaXML and PicaPlain.

PicaReader is copyright (c) 2012 by Herzog August Bibliothek Wolfenbüttel and released under the terms of the GNU General Public License v3.

Installation

You can install PicaReader via Composer.

composer require hab/picareader

Usage

All readers adhere to the same interface. Open a reader on a string of input data by calling Reader::open(), then call Reader::read() to read the next record in the input. When the input contains no more records, Reader::read() returns FALSE. Otherwise it returns a record object created with PicaRecord’s Record::factory() function.

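$reader = new \HAB\Pica\Reader\PicaXmlReader();
// Fetch a PicaXML document and hand it to the reader as a string.
$reader->open(file_get_contents('http://unapi.gbv.de?id=opac-de-23:ppn:635012286&format=picaxml'));
// Read the first record; read() returns FALSE when no record is left.
$record = $reader->read();
$reader->close();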
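
Because Reader::read() returns FALSE once the input is exhausted, reading every record comes down to a simple loop. The following is a minimal sketch of that loop and not part of PicaReader: readAll() and the stand-in class ArrayBackedReader are hypothetical names introduced so the example runs without the library installed. Unlike the real readers, the stand-in is opened with an array instead of a string. With PicaReader installed, you would pass an opened reader such as PicaXmlReader to the helper instead.

```php
<?php
// Hypothetical stand-in that mimics the open()/read()/close() contract
// described above. It is NOT a PicaReader class.
class ArrayBackedReader
{
    private $records = array();

    // Assumption for this sketch: takes an array, not a string of input data.
    public function open(array $records)
    {
        $this->records = array_values($records);
    }

    // Returns the next record, or FALSE when none are left.
    public function read()
    {
        if (count($this->records) === 0) {
            return false;
        }
        return array_shift($this->records);
    }

    public function close()
    {
        $this->records = array();
    }
}

// Hypothetical helper: collect records until read() signals the end with FALSE.
function readAll($reader)
{
    $records = array();
    // Compare strictly against false so that "empty" values do not end the loop.
    while (($record = $reader->read()) !== false) {
        $records[] = $record;
    }
    return $records;
}

$reader = new ArrayBackedReader();
$reader->open(array('first record', 'second record'));
$all = readAll($reader);
$reader->close();
echo count($all), "\n";
```

The strict comparison with false is the important detail: a loose check such as while ($record = $reader->read()) would also stop on any value PHP treats as falsy.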

To filter out records or fields you can attach a filter to the reader via Reader::setFilter(). A filter is any valid PHP callback that takes an associative array representing the record as its argument and returns either a possibly modified array or FALSE if the entire record should be skipped.

The array representation of a record is defined as follows:

RECORD   := array('fields' => array(FIELD, …))
FIELD    := array('tag' => TAG, 'occurrence' => OCCURRENCE, 'subfields' => array(SUBFIELD, …))
SUBFIELD := array('code' => CODE, 'value' => VALUE)

Where TAG, OCCURRENCE, CODE, and VALUE are the respective properties of a Pica+ field or subfield.
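
As an illustration, a record with a single field 001A carrying one subfield 0 could be written out as the PHP array below. This is only a sketch of the structure: the null value used for a field without an occurrence is an assumption, and subfieldValues() is a hypothetical helper for walking the array, not a PicaReader function.

```php
<?php
// RECORD: one FIELD with one SUBFIELD.
$record = array(
    'fields' => array(
        array(
            'tag'        => '001A',
            'occurrence' => null, // assumption: field has no occurrence
            'subfields'  => array(
                array('code' => '0', 'value' => '0001:14-09-10'),
            ),
        ),
    ),
);

// Hypothetical helper: collect the values of every subfield $code
// found in fields tagged $tag.
function subfieldValues(array $record, $tag, $code)
{
    $values = array();
    foreach ($record['fields'] as $field) {
        if ($field['tag'] !== $tag) {
            continue;
        }
        foreach ($field['subfields'] as $subfield) {
            if ($subfield['code'] === $code) {
                $values[] = $subfield['value'];
            }
        }
    }
    return $values;
}

print_r(subfieldValues($record, '001A', '0'));
```

A filter callback receives exactly this kind of array, so a helper like this one can be used inside a filter to decide whether a record should be kept.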

For example, if your source delivers malformed PicaXML records like so:

<?xml version="1.0" encoding="UTF-8"?>
<record xmlns="info:srw/schema/5/picaXML-v1.0">
  <datafield tag="">
  </datafield>
  <datafield tag="001A">
    <subfield code="0">0001:14-09-10</subfield>
  </datafield>
  …
</record>

You can attach a filter function to remove these fields with an invalid tag:

$reader = new \HAB\Pica\Reader\PicaXmlReader();
// Keep only the fields whose tag is set and is a valid Pica+ field tag.
$reader->setFilter(function (array $r) {
    return array('fields' => array_filter($r['fields'], function (array $f) {
        return isset($f['tag']) && \HAB\Pica\Record\Field::isValidFieldTag($f['tag']);
    }));
});
$record = $reader->read(…);
$reader->close();
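
A filter can also reject a whole record by returning FALSE. The sketch below applies a standalone callback directly to an array rather than through Reader::setFilter(), so it runs without the library. The rule it applies (drop fields with an empty tag, then skip the record if no field is left) and the name $dropEmptyTags are illustrative assumptions. It wraps array_filter() in array_values() because array_filter() preserves the original keys and would otherwise leave gaps in the field list. Whether PicaReader needs the list reindexed is not stated here, so treat that step as optional.

```php
<?php
// Illustrative filter: remove fields with a missing or empty tag and
// skip the record entirely when nothing remains.
$dropEmptyTags = function (array $r) {
    $fields = array_values(array_filter($r['fields'], function (array $f) {
        return isset($f['tag']) && $f['tag'] !== '';
    }));
    if (count($fields) === 0) {
        return false; // signal that the whole record should be skipped
    }
    return array('fields' => $fields);
};

// A record with one invalid and one valid field.
$input = array('fields' => array(
    array('tag' => '', 'occurrence' => null, 'subfields' => array()),
    array('tag' => '001A', 'occurrence' => null, 'subfields' => array(
        array('code' => '0', 'value' => '0001:14-09-10'),
    )),
));

$filtered = $dropEmptyTags($input);

// A record that consists only of an invalid field is skipped.
$skipped = $dropEmptyTags(array('fields' => array(
    array('tag' => '', 'occurrence' => null, 'subfields' => array()),
)));

var_dump(array_keys($filtered['fields']), $skipped);
```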

Acknowledgements

Large parts of this package would not have been possible without studying the source of Pica::Record, an open source Perl library for handling Pica+ records by Jakob Voß, and the practical knowledge of our library’s catalogers.
