Friday, December 31, 2010

the plan comes together...

The strength and flexibility of open source tools is staggering!

Gentle Reader, I receive raw binary data by the gigabyte.  I've gotten data from outside my company, with a request for help crunching it.  But this isn't my day job and my management chain doesn't support me.

No matter.  I'm a believer that the "simple model/lots of data" method is one good tool for problem solving.  If nothing else, the challenge of doing Lots with Nothing is a lot of fun!  And, after some effort, I think I have a fairly solid solution: no Matlab seat and no more Random Access crashing.

The chart below shows my process.  Each of these chunks works, beginning with raw binary .MDF files, through the waltz of perl, perl, Octave, SQL, TeX and a small C utility (thanks again, Tim!  much better than my perl script).  At the end I get slick-looking .PDFs and a couple thousand rows in my database!

I figure the three or four people out there who are interested in this kind of stuff are either now wiping the drool off their chins, or laughing at my totally hacked-together solution.  Feel free to laugh it up, but keep in mind this is MY blog, scoffers!


and Happy New Year to you!

Screen shots of various chunks.  Octave system calls summon the MDF shredder, perl and MikTeX's pdflatex so those parts don't get their own window:



- nzvyyx

22JAN2011 edit:  Tim the Mighty pointed out I should be using his .OCT file instead of calling the .EXE.  Good point Tim!  I had been using a perl script and I had no choice; unfortunately I didn't re-evaluate that process when I started using his SignalExtractor.  Now data pipes directly to variables instead of requiring a lengthy csvread() call.  Just goes to show you how a mental framework can steer one's judgement.  

Thursday, December 30, 2010

Leap to \LaTeX\

I don't write reports.  I train my laptop to write them for me.

The trusty data warehouse is doing its thing, and the robots are tirelessly generating data... time to revisit automated data reduction.  A couple of years ago I hacked together some perl and Octave to produce HTML/CSS reports... an ok solution but not ideal because HTML just doesn't match up.

Continuing my "right tool for the right job" mantra, here's a quick-n-dirty-n-kinda stupid Taguchi matrix comparing the suitability of some popular file formats for automated reporting.
So according to this, .TEX is the best format for my needs (and HTML, which I've been using for years, was the worst choice.  Whoops!)  The short form on this decision is:
  • on my end (the source) it supports a variety of tools and has a stable, flexible format
  • for everybody else (aka the SINKS) .PDF is a polished product
There was just this one little problem, minor thing... but... you see... I know squat about \TeX\.  But after a day of poking around and making mistakes, I'm ready.  And suddenly the idea of merging thousands of data plots from two different reports, well, it makes me giggle.  Bring it on, Robots!
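
For the curious, here is a minimal sketch of the "train the laptop to write the report" idea.  It is written in Python rather than the perl and Octave mentioned above, and it uses the plain article class instead of asme2e so it stands alone.  The function name, the file names and the one-figure-per-page layout are all invented for illustration; this is not the actual report script.

```python
from pathlib import Path


def build_report(title, plot_files):
    """Return LaTeX source containing one figure per plot file.

    Hypothetical helper: the name, layout and document class are
    illustrative assumptions, not the scripts described in this post.
    """
    lines = [
        r"\documentclass{article}",
        r"\usepackage{graphicx}",
        r"\begin{document}",
        r"\section*{%s}" % title,
    ]
    for plot in plot_files:
        # Use the file name (minus extension) as the caption.
        # Underscores are special in TeX, so escape them.
        caption = Path(plot).stem.replace("_", r"\_")
        lines += [
            r"\begin{figure}[htbp]",
            r"  \centering",
            r"  \includegraphics[width=0.9\linewidth]{%s}" % plot,
            r"  \caption{%s}" % caption,
            r"\end{figure}",
            r"\clearpage",  # flush floats so a long run of plots can't pile up
        ]
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a .tex file and running pdflatex on it gives the PDF.  Because the source is plain text, merging two reports is mostly a matter of concatenating two lists of plot files before calling the function.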

- nzvyyx

ps A big THAN-Q to ASME for their asme2e document class, and to TexMaker and MikTeX for doing a great job making PDFs.

Wednesday, December 29, 2010

Free Range Office Apps

I think I've stabilized my data warehouse.  On the plus side I'm using standard tools with specified behaviors.  The not-quite-as-happy news is that at least one of the tools currently has issues:  OpenOffice.org.

OOo shows up because I've connected OOo's Base as the client to a MySQL server (yes, I am using a single laptop as both a client and a server).  OOo's Base seemed like a good option for dealing with manual data entry and generating reports, and via the excellent ODBC connector extension it was stupidly simple to get Base talking to MySQL.  MySQL seems rock solid, and the simple SQL i/o I needed had a short learning curve.  SQL scripts aren't too bad either. Oh and MySQL_Workbench gets honorable mention for initial setup.
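
To give a feel for how little "simple SQL i/o" involves, here is a rough sketch.  It uses Python's built-in sqlite3 module with an in-memory database purely so it runs anywhere; that is a stand-in for the MySQL server, not the setup described above.  The table name, column names and sample values are all made up.

```python
import sqlite3


def load_results(conn, rows):
    """Create the (hypothetical) results table if needed and insert rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " run_id TEXT, signal TEXT, peak REAL)"
    )
    # Parameterized inserts: one (run_id, signal, peak) tuple per row.
    conn.executemany(
        "INSERT INTO results (run_id, signal, peak) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def peak_by_signal(conn):
    """Return the largest peak recorded for each signal, sorted by name."""
    cur = conn.execute(
        "SELECT signal, MAX(peak) FROM results GROUP BY signal ORDER BY signal"
    )
    return cur.fetchall()


# In-memory database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
load_results(
    conn,
    [("r1", "rpm", 3100.0), ("r2", "rpm", 3250.0), ("r1", "temp", 88.5)],
)
print(peak_by_signal(conn))  # [('rpm', 3250.0), ('temp', 88.5)]
```

The same CREATE / INSERT / SELECT statements carry over to MySQL with little change, though the placeholder syntax depends on the client library used.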

However!  OpenOffice, with no extra effort on my part, appears to be leaking memory.  It's easy enough to shut it down and restart occasionally but OOo is hardly the M$Orifice Killer that the Linux fanboys would have me believe.  Here I've doubled its real memory usage just by opening and closing some stuff.  I've seen 500MB, worst case.  Not enough of a hassle to force me back to "Random" Access, but troublesome.  More to come.



- nzvyyx

Monday, December 27, 2010

We both loved Powerpoint so I broke up with Windows

My I.T. department recently announced we'd migrate to Windows 7.  At first I thought this was a break point in my work computing life, but upon reflection me & MicroSoft have been drifting apart for some time.  MicroSoft stuff is great for casual or even high end business world "power users."  Powerpoint is an excellent application, for example.  Arguably it's the best thing MicroSoft offers, but it was a love triangle with Powerpoint that triggered my breakup with MicroSoft.

My management requires Powerpoint slides for change review & sign-off.  As a result the engineers were using Powerpoint for everything - including a lot of stuff management didn't want to see.  Powerpoint is excellent at creating high level summaries for management.  It's an ok tool for, say, a user guide.  It is completely miserable for detailed engineering reports.

Everything was showing up in Powerpoint.  We'd open our fairly competent data viewer app to create screen caps for Powerpoint slides.  We'd screen cap out of our excellent code compare app into Powerpoint slides.  Somehow the required management summary had turned into a default file format spec of .PPT!  Then, as a bonus - we'd complain about how lousy Powerpoint was as a technical writing tool... o wait... I'm still complaining...

The concept of tool diversity was my tipping point.  My monogamous relationship with MicroSoft had bothered me, but not enough.  Watching I.T. write checks while friends lost their jobs was a concern, but wasn't enough.  The tipping point was the realization that a single good multi-tool can't beat a variety of really excellent tools.

So, while real power users may scoff at my non-L33T sKillZ, here are some highlights of my breakup:
  • Access!  How you tease!  As my data built up over years, data corruption became intolerable.  SQL looked like a good option, and in hindsight I can say everything else is a waste of time. 
  • the WinXP interpreter suite is... well... missing.  Where are sed, perl, python, Octave, GnuPlot, etc?  I'm an engineer, I do calculations and manipulate data.  Excel?  Sure... but... seriously? 
  • if Octave or perl aren't enough then typically I need a C compiler... bcc, Visual Studio? ... or Cygwin & gcc.  Helloooo, POSIX compliance!
  • My files have to survive laptop upgrades.  The painful jump to Office'07 left me thinking my best option for document life included open file specs and as much plain text as I can get.
    In hindsight that last point seems like a "duh!" moment.  I suspect my company should agree, and if the corporate entity could converse I'd love to banter with it:

    Me:  "Hey Corporation, how ya' doin?  I have a question: you use process spec as a means of independence from individual talent, that much I understand.  But your process documents have a proprietary format, which is owned by a vendor with a demonstrated interest in periodically invalidating their own formats.  Shouldn't we enforce a stable, specified process document format?"

    Corporation:  "Back to work, Slave!"

     - nzvyyx