#!/usr/bin/perl

=head1 NAME

paddpdb2bib -- convert a Palm address book PDB file to a bibtex database for printing with LaTeX

=head1 SYNOPSIS

paddpdb2bib [--encfrom=...] [--encto=...] [--first-last] [ <custom field mappings> ] AddressDB.pdb

=cut

use strict;
use warnings;

use Getopt::Long;
use Pod::Usage;
use Carp;

use Palm::PDB;
use Palm::Address;

use HTTP::Date;

use Encode('from_to');
my $VERSION = sprintf "%d.%03d", q$Revision: 1.7 $ =~ /(\d+)/g;

use constant {
	BUG_2000 => 1900, # what to add to a birthday year under 100.
	PHONE_SEP => ';\newline ',
};
my %BIBFIELD_BY_PADDLABEL = (
		"Work"	=> 'w.phone',
		"Home"	=> 'p.phone',
		"Fax"	=> 'p.fax',
		"Other"	=> 'r.phone',
		"E-mail"=> 'p.email',
		"Main"	=> 'w.phone',
		"Pager" => 'p.cellular',
		"Mobile"=> 'p.cellular',
	);

my %AMBIGUOUS = (
	'Main' => 1,
	'Pager' => 1,
);

=head1 DESCRIPTION

Given a PDB file from the Address Book of a Palm Pilot device (one that
is parseable by Palm::Address), output a database in bibtex format
that is suitable for processing with the F<directory> LaTeX macro package.

=cut

my $DEBUG = 0;
my $ENC_FROM = "cp1251";
my $ENC_TO = $ENC_FROM;
#my $ENC_TO = "koi8-r";

my $URL = '';
my $BIRTHDAY = '';
my $FATHER_NAME = '';
my $SPOUSE = '';
my $LAST_FIRST;

{
	my $first_last = '';
	GetOptions(
		'help|h|?' => sub { pod2usage(1) },
		'man' => sub { pod2usage(-exitstatus => 0, -verbose => 2); },
		'debug=i' => \$DEBUG,
		'encfrom=s' => \$ENC_FROM,
		'encto=s' => \$ENC_TO,
		'first-last!' => \$first_last,
		'version|V|v!' => sub { pod2usage(-message => "$0 version $VERSION",
			-exitstatus => 0, -verbose => 0); },
		'url=s' => \$URL,
		'birthday=s' => \$BIRTHDAY,
		'middle|father=s' => \$FATHER_NAME,
		'spouse=s' => \$SPOUSE,
		#	(map {
		#		("$_=s" => \$LABELS{$_});
		#	} @LABEL_FIELDS),
		)
		or pod2usage();
	$LAST_FIRST = !$first_last;
}
if ($#ARGV != 0) { pod2usage(); }
my @unmapped_custom 
= grep { $_ ne $URL 
	&& $_ ne $BIRTHDAY 
	&& $_ ne $FATHER_NAME
	&& $_ ne $SPOUSE 
	} map { "custom$_" } 1..4;

=head1 OPTIONS

=over 4

=item B<h>

=item B<help>

=item B<?>

Print usage instructions to STDOUT.

=item B<man>

Print the manual page to STDOUT.

=item B<v>

=item B<V>

=item B<version>

Print the version information to STDOUT.

=item B<encfrom>

Name of the encoding from which to transcode the text fields in the PDB.

=item B<encto>

Name of the encoding to which to transcode the output bibtex fields text.
Defaults to the same encoding as given by the I<encfrom> option.

=item B<url>

=item B<birthday>

=item B<father> or B<middle>

=item B<spouse>

These four options map the custom fields to the URL, birthday,
patronymic, and spouse name, respectively. Each takes the name of a
custom field, C<custom1> through C<custom4>. Any unmapped custom field
is appended to the note field, labeled with the name it was given
in the PDB.

=item B<first-last>

Boolean option. If set, the names are output in the
I<First Last>
format.
Otherwise (the default), they are output as
I<Last, First>. The option exists because some documents
call for I<First Last> as the ``preferred name format''.

If the middle name field mapping is specified, the output becomes
I<First Middle Last>,
and
I<Last, First Middle> 
respectively. Note that in pathological cases (e.g., if somebody
has a first name and a patronymic, but not a last name in your
database) this may affect the sorting order in a wrong way,
because bibtex has no way to distinguish a bare I<First Middle>
from a legal I<First Last>.

Finally, in the I<First ... Last> format
bibtex sometimes guesses the first and last names
the wrong way around, and the corresponding contact is then sorted
wrongly as well. Using the default I<Last, First ...> format
usually helps in such cases.


=back
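
The two name layouts described under B<first-last> can be sketched in
isolation as follows. This is only a minimal illustration, not the code
of this script: the C<format_name> helper and its hash keys are
hypothetical, and empty or missing name parts are simply skipped.

```perl

	use strict;
	use warnings;

	# Hypothetical helper: build a bibtex-style name from optional parts.
	sub format_name {
		my (%p) = @_;
		my @given = grep { defined($_) && length($_) } @p{qw(first middle)};
		my $last  = defined $p{last} ? $p{last} : '';
		if ($p{last_first}) {
			# "Last, First Middle"; no leading comma when Last is missing
			return join ', ', grep { length($_) } $last, join(' ', @given);
		}
		# "First Middle Last"
		return join ' ', grep { length($_) } @given, $last;
	}

	print format_name(last => 'Doe', first => 'John', middle => 'Q', last_first => 1), "\n";
	print format_name(last => 'Doe', first => 'John', middle => 'Q'), "\n";

```

When the last name is missing, the I<Last, First> layout degrades to a
bare I<First Middle>, which is the pathological case mentioned above.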

=cut


print '% vim:fileencoding=', "$ENC_TO\n";

my $file = shift;
my $pdb = Palm::PDB->new();
$pdb->Load($file);
my $categories   = $pdb->{appinfo}{categories};

print <<"EOF";
% This is a bibtex database produced automatically 
% from $file, an address book in the Palm Pilot format,
% by paddpdb2bib(1), v. $VERSION.
% The result is designed to be used together with the
% "directory" LaTeX macro package for formatting
% or as a source of address information for your
% LaTeX documents.
EOF
for my $record (@{$pdb->{"records"}}) {
## 	print encode(
## 		$ENC_TO,
## 		decode(
## 			$ENC_FROM,
## 			$record->{fields}{name}. ', '. $record->{fields}{firstName}. "\n")
## 	);
	my $f = $record->{fields};
	for (values %$f) {
		from_to($_, $ENC_FROM, $ENC_TO);
		s/^[\s\n]+$//g; # zero blank fields
# escape nasties from the input string so as not to break the bibtex fields (delimiters etc.)
		s/([%\_\^\&\#\$\\])/\\$1/g;
		s/\n+$//g; # chop the trailing newline to prevent underfull hbox
		s/\n(\s*\n)+/\\dirbreak\n/g; # several conseq. newlines is a new paragraph
		s/"/''/g;
		#s'~'$\tilde$'g; # TODO replace everywhere except the URLs
		s/\n/\\newline\n/g;
	}

=head1 BIBTEX FIELDS

What follows is a snapshot from the documentation of the I<directory>
LaTeX package. The fields emitted by this script
are marked with a trailing comment
naming the Palm::Address fields from which they are extracted.

The decision whether to emit @person or @company is based on
whether the name fields of the record (name, firstName, and the
mapped patronymic) yield a non-empty name.

	@person{key, % of the form contact:<palm record id>
	  name = "Full name(s), in standard BibTeX format", % name, firstName father
	  nickname = "Nickname(s)",
	  birthday = "Birthday date(s), in numeric 'day month' format", % from birthday
	  birthyear = "Birth year(s)", % from birthday

	  p.street = "Street of private residence", % address
	  p.city = "City of private residence", % city
	  p.zip = "ZIP code of private residence", % zipCode
	  p.state = "State of private residence", % state
	  p.country = "Country of private residence", % country
	  p.phone = "Private phone number", % "Home" phone[1-5] fields
	  p.cellular = "Private mobile phone number", % "Mobile" and "Pager" phone[1-5] fields
	  p.fax = "Private fax number", % "Fax" phone[1-5] fields
	  p.email = "Private e-mail address", % "E-mail" phone[1-5] fields
	  p.url = "Private home page", % url
	  p.account = "Private bank account",

	  r.street = "Street of alternative residence",
	  r.city = "City of alternative residence",
	  r.zip = "ZIP code of alternative residence",
	  r.state = "State of alternative residence",
	  r.country = "Country of alternative residence",
	  r.phone = "Alternative phone number", % "Other" phone[1-5] fields
	  r.cellular = "Alternative mobile phone number",
	  r.fax = "Alternative fax number",
	  r.email = "Alternative e-mail address",
	  r.url = "Alternative home page",
	  r.account = "Alternative bank account",

	  w.name = "Work organization name", % company
	  w.title = "Job title", % title
	  w.street = "Street of work organization",
	  w.city = "City of work organization",
	  w.zip = "ZIP code of work organization",
	  w.state = "State of work organization",
	  w.country = "Country of work organization",
	  w.phone = "Work phone number",  % "Work" and "Main" phone[1-5] fields
	  w.cellular = "Work mobile phone number",
	  w.fax = "Work fax number",
	  w.email = "Work e-mail address",
	  w.url = "Work home page",
	  w.account = "Work bank account",

	  note = "Additional notes about the person", % unmapped+note
	}

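For a C<@company> entry the C<w.>, C<p.>, and C<r.> prefixes are stripped
from the phone fields, so that all the numbers of one kind are concatenated
into a single field (e.g., both C<w.phone> and C<p.phone> end up in C<phone>).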

	@company{key, % of the form contact:<palm record id>
	  name = "Company name", % company
	  street = "Company street", % address
	  city = "Company city", % city
	  zip = "Company ZIP code", % zipCode
	  state = "Company state", % state
	  country = "Company country", % country
	  phone = "Company phone number",
	  cellular = "Company mobile phone number",
	  fax = "Company fax number",
	  email = "Company e-mail address",
	  url = "Company home page",
	  account = "Company bank account",
	  note = "Additional notes about the company", % unmapped+note
	}
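
The way the phone entries are grouped per bibtex field, and how the
prefixes are dropped for a C<@company> entry, can be sketched as below.
The label table and the separator mirror the ones used by this script;
the C<group_phones> helper and the sample numbers are made up for
illustration, and skipping unknown labels is an assumption of the sketch.

```perl

	use strict;
	use warnings;

	my %field_for = (
		Work => 'w.phone', Home => 'p.phone', Fax => 'p.fax', Other => 'r.phone',
		'E-mail' => 'p.email', Main => 'w.phone', Pager => 'p.cellular',
		Mobile => 'p.cellular',
	);
	my %ambiguous = (Main => 1, Pager => 1);

	# Hypothetical helper: takes [label, value] pairs in order of appearance.
	sub group_phones {
		my ($is_company, @entries) = @_;
		my %grouped;
		for my $entry (@entries) {
			my ($label, $value) = @$entry;
			my $field = $field_for{$label};
			next unless defined $field;             # unknown label: skipped here
			$field =~ s/^[wrp]\.// if $is_company;  # drop the w./r./p. prefix
			$value .= " ($label)" if $ambiguous{$label};
			push @{ $grouped{$field} }, $value;
		}
		return map { ($_ => join(';\newline ', @{ $grouped{$_} })) } keys %grouped;
	}

	my %person  = group_phones(0, [Work => '555-0100'], [Main => '555-0101']);
	my %company = group_phones(1, [Work => '555-0100'], [Home => '555-0102']);
	print "$_ = \"$person{$_}\",\n"  for sort keys %person;
	print "$_ = \"$company{$_}\",\n" for sort keys %company;

```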

=cut

	{
		my $name = '';
		if ($LAST_FIRST) {
			$name .= $f->{name} if defined ($f->{name});
			# add the comma only after a non-empty last name
			$name .= ','
				if $name && (defined ($f->{firstName})
					|| $FATHER_NAME && defined ($f->{$FATHER_NAME}));
			$name .= ' ' . $f->{firstName}
				if defined ($f->{firstName});
			$name .= ' ' . $f->{$FATHER_NAME}
				if $FATHER_NAME && defined ($f->{$FATHER_NAME});
		}
		else {
			$name = join (' ',
				grep { $_ } # suppress the empty results
					map { $f->{$_} if defined ($f->{$_}) }
						('firstName', $FATHER_NAME, 'name')
			);
		}
		if ($name && $SPOUSE && defined $f->{$SPOUSE}) {
			$name .= " and $f->{$SPOUSE}";
		}

		my $birthday = '';
		if ($BIRTHDAY && defined $f->{$BIRTHDAY}) {
			my ($year, $month, $day, $hour, $min, $sec, $tz) 
				= HTTP::Date::parse_date($f->{$BIRTHDAY});
			if ($year) {
				$year += BUG_2000 if ($year < 100);
				$birthday .= <<"EOF";
	birthyear = "$year",
EOF
			}
			if ($day && $month) {
				s/^(\d)$/0$1/ for ($day, $month); # zero-pad to two digits
				$birthday .= <<"EOF";
	birthday = "$day $month",
EOF
			}
		}

		# for each phone label, join with PHONE_SEP all such phones
		# in the order of their appearance
		my %phones = ();
		my $ph_labels = $record->{phoneLabel};
		for (my ($i, $ph_i) = (1, "phone1"); $i <= 5; $i++, $ph_i++) {
			next unless defined($f->{$ph_i});
			my $label = $Palm::Address::phoneLabels[$ph_labels->{$ph_i}];
			my $field = $BIBFIELD_BY_PADDLABEL{$label};
			unless ($name) {
				$field =~ s/^[wrp]\.//g;
			}
			my $ph = $f->{$ph_i};
			$ph .= " ($label)" if $AMBIGUOUS{$label};
			if (defined $phones{ $field }) {
				$phones{ $field } .= PHONE_SEP . $ph;
			}
			else {
				$phones{ $field }= $ph;
			}
		}

		sub unescape ($) { # unescape back URLs and E-mails
			$_[0] =~ s"\\newline"\n"g;
			$_[0] =~ s/\\//g;
		}

		my $phones = '';
		for my $k (sort keys %phones) { # sorted for a reproducible output order
			my $v = $phones{$k};
			unescape($v) if $k =~ /email$/;
			$phones .= "\n\t$k = \"$v\",\n";
		}
		unescape $f->{$URL} if ($URL && defined $f->{$URL});


		# Palm fields not mapped into bibtex fields should be appended to the note
		my $note =
			join("\\newline\n",

			grep { $_ } # suppress the empty results

			"{\\bf Category: } $categories->[$record->{category}]->{name}",
			map { "{\\bf $pdb->{appinfo}{fieldLabels}{$_}: } $f->{$_}" if defined ($f->{$_}) }
				@unmapped_custom
			);
		$note .= "\\newline\n$f->{note}" if defined $f->{note} && $f->{note};

		no warnings 'uninitialized';
		if ($name) {
			print <<"EOF";
\@Person{contact:$record->{id}, $birthday $phones
	name	= "$name",
	p.street	= "$f->{address}",
	p.city	= "$f->{city}",
	p.state	= "$f->{state}",
	p.country	= "$f->{country}",
	p.zip		= "$f->{zipCode}",
	p.url	= "$f->{$URL}",
	w.name	= "$f->{company}",
	w.title	= "$f->{title}",
	note	= "$note",
}
EOF
		}
		else {
			print <<"EOF";
\@Company{contact:$record->{id}, $phones
	name	= "$f->{company}",
	url	= "$f->{$URL}",
	title	= "$f->{title}",
	street	= "$f->{address}",
	city	= "$f->{city}",
	state	= "$f->{state}",
	country	= "$f->{country}",
	zip		= "$f->{zipCode}",
	note	= "$note",
}
EOF
		}
	}
}

=head1 INSTALLATION

Install the prerequisite Perl modules from CPAN (Palm::PDB, Palm::Address,
and HTTP::Date; the other modules used are part of the standard Perl
distribution), and the script is ready to run.

Install F<directory> from CTAN, unless you already have it installed
as a part of your TeX distribution.

=head1 EXAMPLE

Fetch the F<AddressDB.pdb> from your pilot, for example, by using pilot-xfer(1).
Create a file F<all.tex> as the master document to format your address book,
e.g.:

	\documentclass[a4paper,twocolumn]{article} % vim:fileencoding=cp1251
	\usepackage[T2A]{fontenc}
	\usepackage[cp1251]{inputenc}
	\usepackage[obeyspaces]{url} % so "Foo Bar <foo@bar.org>" appears with the spacing inside!
	\usepackage{fullpage}
	\usepackage[longdates]{directory}
	\raggedbottom\raggedright
	\renewcommand{\dirand}{$\heartsuit$}
	\pagestyle{empty}
	\directorystyle{address}
	\begin{document}
	\nodir{*}
	\directory{all}

	\today
	\end{document}

Then convert the PDB to the bibtex database, and format the document:

	$ paddpdb2bib --encfrom=cp1251 \
		--url=custom1 --birthday=custom2 --father=custom3 --spouse=custom4 \
		AddressDB.pdb > all.bib
	$ texi2dvi all.tex
	$ bibtex all.aux
	$ texi2dvi all.tex
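
Which custom fields end up in the note depends on the mappings given
on the command line. The following sketch illustrates the selection for
a variation of the command above that leaves out C<--father>; the
C<%mapped> hash is a hypothetical stand-in for the parsed options.

```perl

	use strict;
	use warnings;

	# An empty string means that the option was not given.
	my %mapped = (
		url => 'custom1', birthday => 'custom2', father => '', spouse => 'custom4',
	);

	# Every custom field that no option claims goes to the note.
	my %taken    = map  { ($_ => 1) } grep { length($_) } values %mapped;
	my @unmapped = grep { !$taken{$_} } map { "custom$_" } 1 .. 4;
	print "@unmapped\n";

```

Here only C<custom3> is left unmapped, so its contents would be appended
to the note field under the label it has in the PDB.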

=head1 HISTORY

	$Log: paddpdb2bib,v $
	Revision 1.7  2007/02/05 17:20:31  vassilii
	pod docs only:
	1) eliminate from BUGS the one about email typesetting,
	and add to the EXAMPLE a corresponding incantation for package {url}
	2) SEE ALSO - hyperlinking

	Revision 1.6  2007/01/15 16:52:26  vassilii
	In the Last, First format don't start with the comma
	if there is no Last name. The new scheme is the best default
	w.r.t. the sorting of non-latin contacts. Switched the default,
	the cmdline, and the pod docs accordingly. Finally, no sorting
	glitches in my versatile phonebook!

	Revision 1.5  2007/01/12 18:35:48  vassilii
	BUG fixed in the birthday sorting order - now 0-padding the numbers to 2 digits
	the output now includes a descriptive header in a comment format
	documented --version option

	Revision 1.4  2007/01/12 13:10:28  vassilii
	added --version cmdline option
	restored the spouse to work in the First Last ordering as well

	Revision 1.3  2007/01/12 03:33:29  vassilii
	RCS kwds in the pod

	only the "guest" ambiguous phones are now labeled 
		(like (Pager) in the mobile fields)
	workaround posted for the directory \dircheck bug
	pod updates

=head1 BUGS

With non-Latin encodings, bibtex(1) is pretty freaky. You might
need to procure a custom patched bibtex version that works with
your national encoding.

For some reason, the F<directory> package, coupled with LaTeX
and Bibtex on my system fails to properly sort the produced entries
when I use the I<koi8-r> encoding (sorting them in the YU A B C D
order, i.e., according to the Latin alphabet), so for a
decent sorting of Russian one has to use the I<cp1251> code page.

The C<\dircheck> macro only calls C<\Dirheader> for the first Russian letter
present in the sorted output, and never works for the subsequent ones.
This is a bug in the F<directory> package, and a patch against its
version 1.20 has been submitted upstream. You can download the
same patch separately from F<http://www.tarunz.org/~vassilii/pub/dircheck.diff>.

Some escaping debugging might be in order, as well as general code
beautification.

More intelligent guessing of ambiguous mapping between the PDB and bibtex
fields might be in order, maybe based on category and which fields are present.

Adding to the note field could also be made configurable on a per-field basis,
as well as overflowing extra phones of the same type to the note.

The birthday dates are parsed using the internal heuristics of HTTP::Date.
A more generic approach could be used, such as the on-palm format for dates,
or unleashing various DateTime::Format beasts. (bork! bork!)
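
For reference, the post-processing applied to a parsed date can be
sketched as follows. The sketch assumes that the date has already been
split into a numeric year, month, and day (in this script that is the job
of C<HTTP::Date::parse_date>); the C<birthday_fields> helper is hypothetical.

```perl

	use strict;
	use warnings;

	# Hypothetical helper: returns (birthyear, "DD MM"); either may be undef.
	sub birthday_fields {
		my ($year, $month, $day) = @_;
		my ($birthyear, $birthday);
		if ($year) {
			$year += 1900 if $year < 100;   # two-digit years are taken as 19xx
			$birthyear = $year;
		}
		if ($day && $month) {
			# zero-pad to two digits so that the dates sort correctly
			$birthday = sprintf '%02d %02d', $day, $month;
		}
		return ($birthyear, $birthday);
	}

	my ($y, $d) = birthday_fields(75, 2, 5);
	print "birthyear = \"$y\",\nbirthday = \"$d\",\n";

```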

Maybe per-category export?

=head1 SEE ALSO

CTAN entry for the F<directory> package,
L<Palm::Address(3)|Palm::Address>, 
L<bibtex(1)>, 
L<texi2dvi(1)>, 
L<HTTP::Date(3)|HTTP::Date>

=head1 AUTHOR

Vassilii Khachaturov <F<vassilii@tarunz.org>>

=head1 LICENSE

This program is free software; you can redistribute it and/or 
modify it under the same terms as Perl itself.

See F<http://www.perl.com/perl/misc/Artistic.html>
