[OVC-demo-team] BRP fixes and status report

From: Eron Lloyd <elloyd_at_lancaster_dot_lib_dot_pa_dot_us>
Date: Wed Apr 07 2004 - 17:24:05 CDT

Hi all,

Sorry about the long delay in responding--I was out of town through the
weekend and came out of Sunday feeling ill (which I'm only now starting to
recover from). I'm glad to see discussion on the BRP, and welcome the expert
assistance from David and Karl. This is my first major piece of code as a
developer; my background experience is mainly Web (Zope, more specifically).
While I've been studying Python for several years, nothing teaches quite like
a real project (with real deadlines). I'm sure my code isn't the best out
there, but it was the best I could do since I came into the project so late
and only had roughly two weeks to produce a working solution.

I know I was told to let Karl and David work on fixing the code and focus
solely on the literature, but I just couldn't accept letting these mistakes
defeat me and possibly reduce the confidence of the team in my ability. So, I
did both. Attached are quite a few things. First is a new draft of the BRP
full-page guide. I've incorporated as much feedback as possible while within
the window of my "marketing mode" (which hasn't been long due to feeling
ill). While making the changes, I thought of a few unaddressed questions:

1. Is the 1 operator/2 witnesses thing just an arbitrary number? What about
for polling stations that experience a large volume of turn-out? Could we
possibly chain together several BRP stations and have more people involved in
the hand-counting and ballot-scanning? (This is a long-term thought. My
initial response is to have more polling places, made possible by the lower
cost of equipment.) I know the Committee of Seventy had that comment about
our approach.

2. Is the general language of the medium used to store EBIs on vote stations
OK?

3. Just want to make sure that I'm correct in assuming that a) test ballots
are matched, but not tabulated, and b) spoiled ballots are neither matched
nor tabulated (obviously).

4. Should there be a way to have the vote stations mark spoiled ballots so
none slip through? This could be marked as an attribute in the EBI and with
a special marker on the paper ballot.

Please review the attached draft and send me any feedback ASAP, including
on layout, page format, etc. Two things to note: the image quality is poor
because I exported at low quality to save bandwidth (it will be better in
the final copy), and none of the text is hyphenated in this proofing draft.

On to the BRP code... Attached is also a copy of my latest work. I have tried
to address every issue we've experienced, but obviously will need others to
look over it all (hopefully many others this time!). The problems we had that
kept us up to the wee hours of the morning before the demo have been tracked
down to several factors:

1. Scanning trouble. That wonderful IndexError problem when scanning in the
barcodes was due to a couple of changes in convert.py which have since been
discovered and resolved. The API calls returned empty votes which obviously
wouldn't be useful and caused major problems in producing the REBIs (which
are needed to do anything else). My copy was unchanged and not from the CVS
version, so it worked without a hitch. Each one of us thought the other
was hallucinating from lack of sleep when mine would work and theirs
wouldn't ;-). This has been resolved both in the API and in my code. I do
recommend that the API do something more useful than return an empty list; I
patched it to raise a ValueError exception instead, for example.

2. Another big problem that led to instability was the test suite used to
test the BRP. Because a canonical collection of ballots had never been
created, I threw together what I thought was a good set of ballots, generated
by a simple script from one of the other team members, and produced a PS
file for each one to scan in. While they did work in most instances during
development, closer to the deadline the filename format's straying from the
spec caused some confusion in transferring and comparing EBIs. I patched the
script that generates the ballots to omit the serial number and it seemed to
do the trick.

3. This was a sneaky one: A major intersection in the code where accidents
could occur is the set of components that serialize an EBI from a barcode. This
mistake was purely mine; when passing the parameters to getxml.ballotxml(), I
just assumed the date arg would be written as an attribute in the XML file.
It took me a little while to realize that it gets split up in the function!
No wonder the ballots weren't matching, producing file errors and no
tabulation (at least we know comparing xml_objectify objects works great)!
Now, I use a Python datetime object and .strftime() it appropriately at
different points in the code.
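
Two of the fixes above can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not the actual convert.py or
getxml code: decode_votes() and ballot_date_strings() are hypothetical
names, the digit-per-vote decoding is invented for the example, and the two
date formats are placeholders rather than anything taken from the BRP spec.
The first function shows the idea of raising ValueError instead of returning
an empty list; the second shows keeping a single datetime object and
formatting it with .strftime() wherever a string is needed.

```python
from datetime import datetime


def decode_votes(barcode):
    """Hypothetical stand-in for the barcode-decoding API call.

    Raises ValueError on unusable input instead of quietly returning
    an empty list that only fails later, when the REBI is built.
    """
    digits = barcode.strip()
    if not digits or not digits.isdigit():
        raise ValueError("unreadable barcode: %r" % (barcode,))
    # Assumption for illustration only: one digit per vote.
    return [int(ch) for ch in digits]


def ballot_date_strings(when):
    """Derive every date string from one datetime object.

    Both formats are placeholders, not the formats the spec requires.
    """
    compact = when.strftime("%Y%m%d")     # e.g. for a file name
    dashed = when.strftime("%Y-%m-%d")    # e.g. for an XML attribute
    return compact, dashed


if __name__ == "__main__":
    print(decode_votes("0123"))
    print(ballot_date_strings(datetime(2004, 4, 7)))
    try:
        decode_votes("")
    except ValueError as err:
        print("rejected:", err)
```

With an exception, a bad scan stops at the scanner step, where the operator
can rescan, rather than surfacing later as an IndexError.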

Those were the major show-stoppers. That doesn't mean that there aren't others
or that I've fixed everything correctly. But now we know what was happening
at least. Besides stomping out those problems, I've integrated several other
requests/changes:

1. At Alan and Fred's request, the application now BEEPS every time a barcode
is scanned in properly.

2. The application now implements the BRP spec to the T. Scanned barcodes
produce "b-" files, vote station files contain "v-", and verified files strip
the prefix entirely. The directories "scanned/", "stored/", and "verified/"
have been renamed to "barcodedata/", "votingmachinedata/", and "results/",
respectively. I kind of liked the old naming, but if you want the spec,
you've got the spec.

3. A final issue directly related to the code is the display of an "orphaned"
or "missing" ballot total, kind of an "extension" of the spec. I thought this
was a useful calculation and included it, mainly for debugging, but perhaps
it has additional value. It basically boils down to this formula:

i = Invalid Ballots (Test Ballots + Spoiled Ballots)
d = Voting Machine Ballots
p = Paper Ballots
m = Manual Ballot Count

abs(((d - i) - (m - i)) - ((d - i) - (p - i)))

Basically, it checks that the totals agree with one another, and can be used
to detect missing ballots, either not registered on the voting machines
(EBIs) or omitted or deleted from the REBIs. I guess it differs from a
match/unmatch flag in that it means a ballot is actually NOT THERE as
opposed to there and containing different values. This can easily be removed
if you'd like.
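
A minimal sketch of that calculation in Python, using the same variable
names as the formula. The function name and the sample numbers are made up
for illustration and are not taken from wizard.py. One thing the sketch
makes visible: expanded, the d and i terms cancel, so the expression as
written always equals abs(p - m), the gap between the paper ballot total and
the manual count.

```python
def missing_ballot_total(d, p, m, i):
    """Orphaned/missing ballot total, exactly as in the formula above.

    d -- voting machine ballots
    p -- paper ballots
    m -- manual ballot count
    i -- invalid ballots (test ballots + spoiled ballots)
    """
    return abs(((d - i) - (m - i)) - ((d - i) - (p - i)))


if __name__ == "__main__":
    # Hypothetical totals: the manual count is two short of the paper total.
    print(missing_ballot_total(d=100, p=100, m=98, i=5))
    # Everything agrees, so nothing is missing.
    print(missing_ballot_total(d=100, p=100, m=100, i=5))
```

If the intent is also to catch ballots missing from the voting machine
totals, d would need to enter the comparison in a way that does not cancel.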

And now, for the discussion of platforms, packaging, and library versions.
Following the conversation these past few days (both on the list and off),
several comments came up about the way the BRP system is implemented. Some
points were definitely well taken. Others I would like to have a chance to
address. Please don't take this as defensive; I'd simply like to hammer out
these issues some more.

1. Regarding using bleeding edge stuff: I have to ask what the concern is
here, especially looking at the actual releases I require to run the system.
Besides SIP (which, although an RC version, is as stable as the 3.x series),
every requirement for the code is the latest *stable* version. As with most
things in the free software community, new releases come out on an almost
daily basis; that is simply the cultural norm. For the stuff I use as dev
tools, I find it odd not to want to get the latest features and fixes,
ensuring that coding and the actual end product avoid known, preventable
problems and can take advantage of new additions. Python 2.3 is much more
than a bug-fix dot release. In addition to a 20-30% run-time speedup (which
is worth it in itself!), there are many important features, such as true
boolean types, datetime module, fixes to garbage collection and weak
references (which PyQt does a LOT of), major distutils enhancements, UNICODE
source encoding, and many others. Qt 3.3 contains an equivalent amount of
major updates, all of which are critical to productive development. PyQt/SIP
follow as well. Using the distro packages as the platform for this code is a
bit unreasonable; Fedora Core 1 and SUSE 9 are both over 6 months old. What
can be done is to streamline the packaging; there are ways to build PyQt/SIP
into the Python interpreter and ship with a static Qt build. Also, new
releases of both SUSE and Fedora are right around the corner, and will be
offering equivalent if not newer versions of the dependencies anyways.

2. On platform issues, why should we even be talking about Mac or Windows
(right now, especially) at all? If we hope to develop an end-to-end open
system, then that's what we need to do. In looking at other toolkits/APIs for
the OVC applications, you won't find a better, more complete (and
cross-platform ready) system than Qt. AnyGUI is more of a least common
denominator, which isn't much, and wxWindows is more of a hack than PyQt
(GTK+ (C) > wxWindows (C++) > wxPython), and requires many more dependencies
(GTK+ especially). If we need to look at porting various applications to
Windows or Mac, let's cross that bridge when we get there. Qt is GPL'd on
Mac, and probably will be on Win32 shortly. If we need commercial licenses,
then we'll get them and pass the cost on to the customer. Are we against
supporting good companies? I'll stop evangelizing now, but I really hope you
consider this all. As for now, none of the EVM components require anything
but Linux, so let's stick to that. The only thing I'm wondering about is if
we should move to a compiled language (C++), so we can simply hand someone a
static binary that includes everything necessary.

3. It is very hard to produce this as simply "throw-away" code as expected,
although I do understand that is what it is for now. At the same time, can we
please assert that this is NOT going to be thought of as code that will be
used for certification and think of ways to refactor it as such? One of the
reasons wizard.py is a big conglomerate is that otherwise it would quickly
turn into a full-fledged Python package, something I was told early on to
avoid for right now. Sure, I would like to factor out 90% of the stuff in
there, but we don't have decisions on any of that and it's not that important
for the demo stuff. What is important is how we package, share and launch the
existing code. The BRP isn't really integrated into the rest of the codebase,
and I don't know if it needs to be. At the same time, it *will* need to know
where to find the shared modules it uses, be they installed
in /usr/lib/python/site-packages, /usr/local/... (my suggestion), a user
directory using a .pth file (like the demo machine now), or whatever.

Alright, enough talk; here is the work. Also attached you will find several
other things: a new wizard.py module incorporating all the fixes mentioned
above, and with it a canonical suite of test ballots, their inventory file,
and a list of their barcodes, which can be typed in as input if you don't
have a scanner. Unpack votemachinedata.tar.gz in your BRP "working directory" to
gain access to this data. If you'd like, please burn the vs-0* directories
each to a CD and test the loading/unloading (just burn the contents to the
root of the disc). Make sure you have an acl.txt file in the working
directory to authenticate against. I've included a sample one to use.
Regarding the installation of wizard.py so that it can run, there are
half-a-dozen ways to do it. The simplest is to call "$ python wizard.py" from
within the working directory, ensuring that the dependencies (evm2003,
gnosis.xml, Qt, PyQt, SIP) are all resolvable (including the python
interpreter--2.3.X is required). I'm working to further document that process
with an updated README within the next day or so.
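
Until that README exists, a quick way to confirm the dependencies are
resolvable is to ask the interpreter whether it can find each module. The
sketch below is only an illustration: it uses importlib, which is present in
current Python releases but not in 2.3, and the import names "qt" and "sip"
are assumptions about how PyQt and SIP are exposed on a given install.

```python
import importlib.util


def missing_modules(names):
    """Return the module names the current interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A dotted name whose parent package is absent raises
            # ImportError; treat that as "not found" as well.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # evm2003 and gnosis.xml come from the list above; "qt" and "sip"
    # are assumed import names for PyQt and SIP.
    required = ["evm2003", "gnosis.xml", "qt", "sip"]
    absent = missing_modules(required)
    if absent:
        print("not resolvable:", ", ".join(absent))
    else:
        print("all dependencies resolvable")
```

Run it from the working directory with the same interpreter that will launch
wizard.py, so that any .pth files or path settings are picked up the same way.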

Thanks again for all the feedback and help in moving forward. I hope that
everyone is still enjoying this project even now when things are becoming
more stressful, demanding and exacting.

To health!

Eron

---
==================================================================
= The content of this message, with the exception of any external 
= quotations under fair use, are released to the Public Domain    
==================================================================