May 23rd, 2018

Welcome to www.theparticle.com. It's the newest pre-IPO dot bomb that's taking the world by storm. Now is a perfect time to buy lots of worthless and overpriced shares!
     What this site is about?

The Internet is becoming more and more polluted with junk-mail, people selling crap, and businesses which don't know their place on the net. They're all trying to make this wonderful place (i.e.: the net) into hell (i.e.: the real world). The Internet should be viewed as a place of imagination, creativity, and most of all: fun. The Internet is not some really advanced tool for searching for people to rip off. It's about searching, and finding, things which are useful, helpful, and promote the sharing of ideas. This is what this site is striving to become.

News, Updates, & Rants...

     May 14th, 2018

Finished reading AWS: Security Best Practices on AWS: Learn to secure your data, servers, and applications with AWS by Albert Anthony. This is essentially a list of AWS tech that are key to securing data on AWS. Every corp planning to move to AWS should do this stuff.

- Alex; 20180514

     May 11th, 2018

Finished reading HBase Design Patterns by Mark Kerzner, Sujee Maniyam. Pretty good non-very-technical book. Doesn't have anything too new (HBase: The Definitive Guide is way better).

- Alex; 20180511

     May 9th, 2018

Windows Notepad fixed after 33 years: Now it finally handles Unix, Mac OS line endings. Wow! Though I hardly use Windows, this does annoy me often enough for me to get excited about it.

- Alex; 20180509

     May 7th, 2018

So on March 20th, 2018, I speculated about the cause of the Uber self-driving car hitting a pedestrian... and now it turns out I was right: Uber crash reportedly caused by software that ignored objects in road.

In other words, there *are* things on the road that the car is *designed* to ignore.

- Alex; 20180507

     May 5th, 2018

Spent the morning watching the Berkshire Hathaway annual meeting.

Kiddo is 6-months today :-)

In other news, the reinforcement (or boosting) logic on top of the n-tuple classifier: Apparently the performance curve flattens out... Running on EMNIST, after 100 training iterations, for n=10, using 32 tables gets us 92% accuracy, using 64 tables gets us 94% accuracy, using 128 tables gets us 94.6% accuracy, and using 256 tables gets us 95% accuracy. And then it just flat lines. I'm sure the code can be pushed to get 96% accuracy, perhaps by using bigger n, or using more tables, but it seems it's just not worth it.

So back to the idea land...

- Alex; 20180505

     May 4th, 2018

Finished reading The Structure of Scientific Revolutions by Thomas S. Kuhn. Pretty neat book. Essentially a bit of change in perspective on the clarity and correctness of science. Text books record only theories that survived the test of time, giving an impression that these theories and ideas behind them had a very structured progression. Reality is often much more disjointed and unpredictable than that.

- Alex; 20180504

     April 30th, 2018

Starting new job today :-)

- Alex; 20180430

     April 29th, 2018

Implemented a reinforcement (or boosting) logic on top of the n-tuple classifier that recognizes EMNIST digits. After about 100 iterations of that, the code went from ~80% accuracy to 95%!

- Alex; 20180429

     April 28th, 2018

Scientists Have Confirmed a New DNA Structure Inside Human Cells. Awesome stuff.

North Korea will close main nuclear test site in May, South says. Wow. The impossible is actually happening!

- Alex; 20180428

     April 26th, 2018


- Alex; 20180426

     April 23rd, 2018

Why everyone is stressing about the 10-year Treasury yield. Uh, oh.

In other news, finally heard back from $CORP_NAME. The background check is mostly done, except for a few things that they're willing to do after I join. So they've asked me for the start date.

Coincidentally, today is exactly two months since my last day at FINRA :-)

As much as I like relaxing, I'll probably start next week.

- Alex; 20180423

     April 17th, 2018

So apparently my old ZenPad, that `broke' almost a year ago (the wifi wasn't coming up---no matter how many resets I've done, etc.; I thought it was a hardware issue---wifi chip burned out, or something) suddenly came back to life and started working again (wifi and everything).

In other news, discovered that my calendar was wrong. One of my classes actually starts 10 minutes earlier (6:05pm) than is marked in my calendar (6:15pm). So I've been showing up late most of the semester! This is pretty embarrassing :-/

UPDATE: So to add to my embarrassment, this time, I showed up on time... but to the wrong classroom. 232NE vs 234NE. They're nearly identical, and I really didn't pay attention which one I walked into. Strangely enough, some students actually showed up in the wrong classroom---so for about 20 or so minutes we were wondering what happened to the rest of the class :-/

- Alex; 20180417

     April 16th, 2018

Yey, kiddo had his first meal today: baby oatmeal!

- Alex; 20180416

     April 4th, 2018

Urgh. Apparently Yahoo! Finance API is gone. Didn't notice it until today :-/

- Alex; 20180404

     April 2nd, 2018

This doesn't happen every day: an actual snowy day in NYC... in April!

- Alex; 20180402

     March 28th, 2018

Visited two Toys R Us today.

In other news, accepted $CORP_NAME offer.

- Alex; 20180328

     March 27th, 2018

Ok, I think I might need to create some exams soon :-/

Heard back from $SOCIAL_MEDIA_CORP. They have made a ``difficult decision not to move forward at this time.'' So... amm... The Paradox of Choice: I'm pretty relieved. I've given them a try, a very fair chance to hire me, and they chose not to, so... yey!

Guess it would've been weird of them to hire someone who's never had (and still doesn't have) a $SOCIAL_MEDIA_CORP account :-)

- Alex; 20180327

     March 26th, 2018

Having HSBC Quant dinner event. Rhydian Cox, HSBC Chief Risk Officer, did audience polling on stuff like when was HSBC founded, how many countries they operate in, which country had the first HSBC ATM machine, etc., and all sorts of similar trivia questions. Winners got HSBC branded insulated bottles. I didn't win anything. The dinner was grass with mayonnaise. And beer. Lots of beer. (and wine). All in all, a pretty useless event.

In other news, $CORP_NAME came back with an offer.

- Alex; 20180326

     March 25th, 2018

Changed oil in my 4runner.

- Alex; 20180325

     March 24th, 2018

Kiddo said "ma-ma". For the first time. He was just moving lips up and down, and saying "aaa..." sound, and it came out as a pretty clear "ma-ma" sound.

- Alex; 20180324

     March 23rd, 2018

Got kiddo a stroller. Toys R Us is going out of business, and had a 20% off sale on a stroller we wanted, so we just bought it... without doing any research. It looked good, and sturdy, etc. But... it's VERY heavy. I should have thought of that as I lifted the box in the store. Anyways, Suneli can't lift it---it's very easy to push around, big sturdy wheels, but getting it in and out of the car is gonna be a major pain. We'll likely need a lighter stroller just for car-trips.

Had an in-person $SOCIAL_MEDIA_CORP interview. They did career discussion, two design hours, two coding hours, and one lunch hour.

Coding 1, they asked to implement string.h function:
char *strstr(const char *haystack, const char *needle)
which surprisingly to me, was a challenge. It *looks* easy, but actually doing it, if you haven't done it in forever (like in 20 years), isn't trivial. I think I got code that worked---very basic loop through haystack, and then another loop that goes through needle. Very inefficient, but eh.
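For reference, here's roughly what that basic double loop looks like. This is a Python sketch, not the code from the interview; it returns an index (or -1) instead of the pointer (or NULL) that the C function returns, and that substitution is my assumption for illustration.

```python
def strstr(haystack: str, needle: str) -> int:
    """Return the index of the first occurrence of needle in haystack, or -1."""
    if needle == "":
        return 0  # C's strstr returns haystack itself for an empty needle
    n, m = len(haystack), len(needle)
    # Outer loop: every position where needle could still fit.
    for start in range(n - m + 1):
        j = 0
        # Inner loop: compare needle character by character.
        while j < m and haystack[start + j] == needle[j]:
            j += 1
        if j == m:
            return start
    return -1
```

Worst case is O(n*m) comparisons, which is the "very inefficient, but eh" part.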

Second coding question was: this is all 2D: there's a ship in a tunnel with circular obstacles. Height of tunnel is H (so from 0 to H in y axis), and each obstacle is some object that has {x,y,rad}. You're given input of H and a list of obstacles, and your function has to return true or false whether the ship can make it through the tunnel.

If there are no obstacles, the ship makes it through. If there's 1 obstacle that blocks the whole tunnel, obviously the ship doesn't make it through. If there are two obstacles placed such that there's a tiny hole in-between, then the ship makes it through. But if those two obstacles are really next to each other, they could block the tunnel. You could have dozens of obstacles forming a barrier for the ship.

My first reaction to this question was WTF!? This is a 45 minute programming assignment???

My second reaction, there's gotta be a trick to it. This should be easy to program. Obviously you can't compare ship to each obstacle, since obstacles can combine to block passage (while individual obstacles may not block tunnel).

Ok, so how do you combine obstacles? If they're touching of course. So there should be some way to glue touching obstacles into.... COMPONENTS. Connected components! Graph theory pays off!

So the solution I came up with is to use depth-first-search on obstacles, to build connected components (using distance to determine if they're connected---distance between centers is less than or equal to sum of radii). Once you have components, for each entire component, if the maximum of y plus radius is above the tunnel (greater than H), AND the minimum of y minus radius is below the tunnel (less than 0), then that component blocks the tunnel. Finished coding just in time.

I'm very proud of figuring out this one on the fly. It's a damn hard question.
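A sketch of that connected-components idea in Python (not the interview code). Assumptions that are mine: obstacles are (x, y, rad) tuples, the ship is treated as a point, and an obstacle that merely touches a wall counts as sealing that side (so the comparisons are >= and <= rather than strict).

```python
from math import hypot

def ship_can_pass(H, obstacles):
    """Return False if some group of touching circles spans the tunnel
    from the floor (y = 0) to the ceiling (y = H), True otherwise."""
    n = len(obstacles)
    seen = [False] * n
    for i in range(n):
        if seen[i]:
            continue
        # Iterative depth-first search collecting one connected component.
        stack = [i]
        seen[i] = True
        top, bottom = float("-inf"), float("inf")
        while stack:
            k = stack.pop()
            x, y, r = obstacles[k]
            top = max(top, y + r)        # highest point of the component
            bottom = min(bottom, y - r)  # lowest point of the component
            for j in range(n):
                if not seen[j]:
                    xj, yj, rj = obstacles[j]
                    # Circles touch or overlap when the distance between
                    # centers is at most the sum of the radii.
                    if hypot(x - xj, y - yj) <= r + rj:
                        seen[j] = True
                        stack.append(j)
        if top >= H and bottom <= 0:
            return False  # this component forms a wall-to-wall barrier
    return True
```

The neighbor scan makes this O(n^2) in the number of obstacles, which is fine for a whiteboard-sized input.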

Then there were two architecture questions. One related to notifications (e.g. push vs pull notification, and how they'd scale... e.g. pull doesn't need the server to maintain state, push requires server to keep list of clients, and can't be scaled as easily), etc.

The other architecture question had to do with designing priority metric for a search---you type in something into a search box, and a list of a dozen or so things show up that you might be interested in. How do we get that dozen that you might be interested in. After a bit of discussion, I proposed an architecture where we pull top N (perhaps 10000) records that start with the search prefix string, and then rank those by the user's profile preferences.

So there would be two rankings, one is some global ranking (applies to everyone), and within the top N of that global ranking, what you'd personally see would be determined by your personal rankings. Then questions went into the direction of how to determine global rankings (by how many people visit/watch/like, etc.) and personal rankings (how long you've stayed at a site, what kind of content you generally consume, etc., like a personalized feature vector, etc.)
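A toy Python sketch of that two-stage ranking, just to make the hand-waving concrete. Everything here is hypothetical: the item layout (name, global score, feature dict), the user weight vector, and scoring by a plain dot product are my assumptions, not anything discussed in the interview.

```python
def suggest(prefix, items, user_weights, top_n=10000, k=12):
    """items: list of (name, global_score, features) where features is a
    dict of feature name -> value. Returns up to k suggested names."""
    # Stage 1: global ranking, keep the top N items matching the prefix.
    matches = [it for it in items if it[0].startswith(prefix)]
    candidates = sorted(matches, key=lambda it: it[1], reverse=True)[:top_n]

    # Stage 2: re-rank those candidates by the user's personal preferences.
    def personal_score(it):
        return sum(user_weights.get(f, 0.0) * v for f, v in it[2].items())

    ranked = sorted(candidates, key=personal_score, reverse=True)
    return [it[0] for it in ranked[:k]]
```

Python's sort is stable, so items with equal personal scores stay in global-rank order.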

All in all, the architecture questions were very hand-wavy and fuzzy. But I guess that was their purpose.

Lunch was great.

The final coding question is the one I got stumped on. Given a string of digits, such as "23562" insert any number of "+" or "-" to have the outcome be some N, e.g. "23+5-6+2" = 24. Write a function that outputs all the possible strings that sum to N.

This puzzle seems easy---and I thought of some sort of recursion---at every digit, you can either insert nothing, insert a "+" or insert a "-", and you need to explore all the subtrees of those choices. But no matter what I tried to set up, just couldn't get the right iteration going... and 45 minutes ran out :-/

UPDATE, sat down on the weekend and it wasn't really as complicated as it seemed. Non-recursive 17 lines of Perl code. In about 20 minutes (yes, I've been thinking about it for the last few days). So yeah, it was possible to do it during interview:

$ perl fbdigit.pl 123456 100
$ perl fbdigit.pl 1234567 100
+1+2+34+56+7 = 100
+1+23+4+5+67 = 100
$ perl fbdigit.pl 12345678 100
+1-2-3+45+67-8 = 100
+1-2+34-5-6+78 = 100
+1+23-4+5+67+8 = 100
+12+3-4+5+6+78 = 100
+12+34-5+67-8 = 100
$ perl fbdigit.pl 123456789 100
-1+2-3+4+5+6+78+9 = 100
+1+2+3-4+5+6+78+9 = 100
+1+2+34-5+67-8+9 = 100
+1+23-4+5+6+78-9 = 100
+1+23-4+56+7+8+9 = 100
+12-3-4+5-6+7+89 = 100
+12+3-4+5+67+8+9 = 100
+12+3+4+5-6-7+89 = 100
+123-4-5-6-7+8-9 = 100
+123+4-5+67-89 = 100
+123-45-67+89 = 100
+123+45-67+8-9 = 100
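
The Perl itself isn't reproduced here, so below is a Python sketch of one non-recursive way to do it (not a port of fbdigit.pl). It assumes a non-empty digit string and, like the output above, also tries a leading "+" or "-". The function names are mine.

```python
from itertools import product

def evaluate(expr: str) -> int:
    """Sum a string like '+12-3+45' term by term, without eval()."""
    total, sign, number = 0, 1, 0
    for ch in expr:
        if ch in "+-":
            total += sign * number           # close the previous term
            sign, number = (1 if ch == "+" else -1), 0
        else:
            number = number * 10 + int(ch)   # extend the current term
    return total + sign * number

def expressions(digits: str, target: int):
    """Yield every signed expression over digits that sums to target."""
    gaps = len(digits) - 1
    for lead in "+-":
        # Each gap between digits gets nothing, a "+", or a "-".
        for ops in product(("", "+", "-"), repeat=gaps):
            expr = lead + digits[0] + "".join(
                op + d for op, d in zip(ops, digits[1:]))
            if evaluate(expr) == target:
                yield expr
```

Something like `for e in expressions("1234567", 100): print(e, "= 100")` can be compared against the runs above; the order of the lines may differ. It's brute force over 2 * 3^(len-1) candidates, which is plenty fast for a handful of digits.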

- Alex; 20180323

     March 22nd, 2018

Apparently I'm still in the running for $CORP_NAME. It's just taking them a while to come up with an offer. Should hear back from them shortly.

- Alex; 20180322

     March 21st, 2018

Ok, so this blizzard started out slow, but then over the entire day, piled up quite a bit of snow. Too bad it will all melt within a day :-/

In other news, missed an email from $CORP_NAME HR, who wanted `to catch up'. Hmm. Will chat with them tomorrow morning.

- Alex; 20180321

     March 20th, 2018

Urgh, another blizzard.

How a Self-Driving Uber Killed a Pedestrian in Arizona, and Arizona police release video of fatal collision with Uber self-driving SUV. My outsider guestimate, it's a similar case to when Tesla hit a firetruck: it just wasn't expecting to see anyone walking across the road, so it ignored them---there are lots of stationary things on or near the road that the self-driving car must ignore (traffic signs, images on the road, garbage on the road, etc.). This wasn't at a cross walk, and seemed to be right in the middle of the road; not a place you generally see pedestrians---and frankly, a human driver would've hit that pedestrian too.

In other news, got an NDA email from $SOCIAL_MEDIA_CORP. Signed it. Essentially I can't use any proprietary stuff I learn from them during the interview, but whatever I reveal during the interview, they're free to take it and use it however they wish.

Had another long tech interview at $FIN_CORP. I'm almost tempted to join---they sure got some sharp folks there. The work they do is also kind of in-line with what I'd want to do.

- Alex; 20180320

     March 19th, 2018

Finally got my car `fixed'. Replaced passenger-side air-bag inflator... so now feel safer :-)

$SOCIAL_MEDIA_CORP called and scheduled an in-person interview this Friday. Their email contains a lot of preparation material. But... the recruiter says: "they can ask you anything", and then says "ok, it's best if you prepare"... for what can you prepare... if they can ask you about anything?

It's times like this, you either know stuff or you don't.

$SOCIAL_MEDIA_CORP has a huge graph of relationships. As well as directed graph of following relationships. So... my guess questions could be related to: friend recommendations (in real life, if you're friends with two people, perhaps they'll also wanna be friends with each other?). But I'm guessing they've solved this problem already. How does one pick those out from a graph? Other questions could be related to similarity scores, e.g. if folks are both following a particular celebrity, then they're in some sense equivalent---have similar interests. If one of them follows a trendy new singer, perhaps the other one will also want to follow that trendy new singer?

Articulation points and bridges? Those are individuals and relationships that could be key to connectedness of a graph. I'm just thinking what graph theory stuff could fit into their business model.

In still other news, $FIN_CORP set up a 2nd in-person interview for tomorrow---to go over the homework solution I'm guessing.

- Alex; 20180319

     March 18th, 2018

Work work work on the PhD day...

In other news, Suneli started driving school. Took the "5-hour" pre-driving class (which was closer to 2-hours than 5), and drove a car for... I'm guessing 30-40 minutes :-)

In other news, finished $FIN_CORP homework. Spark/Scala. Not too bad. Hopefully most of it works. The basic idea: you have a quote feed (in pipe delimited FIX-like format). Some quotes are from venues, and others are consolidated quotes. The venues could publish either bid, or ask side of the quote (separately). The consolidated quote is indicated by not having a venue code.

Task 1: for all venue quotes, create a list of filled-in quotes, (fill in the missing side). That's easily done by doing a "last" windowing function, with "true" as the 2nd parameter (that's the ignore nulls flag).
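
To show what that "last, ignoring nulls" window does, here's a plain Python sketch of the same fill-forward logic (not the Spark/Scala homework code). The record layout, dicts with 'venue', 'bid', and 'ask' keys and None for an unpublished side, is a hypothetical stand-in for the FIX-like feed.

```python
def fill_quotes(quotes):
    """quotes: time-ordered list of dicts with keys 'venue', 'bid', 'ask'.
    Returns new dicts where each missing side is filled with that venue's
    most recent non-null value (or stays None if there is none yet)."""
    last = {}    # venue -> latest known {'bid': ..., 'ask': ...}
    filled = []
    for q in quotes:
        state = last.setdefault(q["venue"], {"bid": None, "ask": None})
        for side in ("bid", "ask"):
            if q[side] is not None:
                state[side] = q[side]   # remember the newest value per side
        filled.append({**q, "bid": state["bid"], "ask": state["ask"]})
    return filled
```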

Task 2: a bit more complicated: for each consolidated quote, indicate the count of venues that are quoting that price. (e.g. for each bid, and ask, add a count of venues).

There are several ways of doing this one. I decided to save the last bid/ask price for each venue (in a map), and whenever I saw a consolidated quote, just go through all saved venues and count those values where consolidated price matches the venue price.

So the usual repartition then sortWithinPartitions, then mapPartitions, and use the Map within the iterator... output is all quotes, except consolidated quotes got two extra counters, one for bid and one for ask.

Filter to keep only consolidated quotes, and... done. Write out the output.
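
The per-partition part of Task 2 boils down to something like the following Python sketch (again, not the actual Spark job). Assumptions of mine: quotes arrive already sorted by time, a consolidated quote has venue None, and the output field names 'bid_venues' and 'ask_venues' are made up for illustration.

```python
def count_matching_venues(quotes):
    """quotes: time-ordered dicts with 'venue' (None for consolidated),
    'bid', and 'ask'. Returns only the consolidated quotes, each with two
    extra counters: how many venues currently quote that bid / that ask."""
    last = {}   # venue -> last seen {'bid': ..., 'ask': ...}
    out = []
    for q in quotes:
        if q["venue"] is not None:
            # Venue quote: update the saved price for whichever side it has.
            state = last.setdefault(q["venue"], {"bid": None, "ask": None})
            for side in ("bid", "ask"):
                if q[side] is not None:
                    state[side] = q[side]
        else:
            # Consolidated quote: count venues whose saved price matches.
            counts = {
                side + "_venues": sum(
                    1 for s in last.values()
                    if s[side] is not None and s[side] == q[side])
                for side in ("bid", "ask")
            }
            out.append({**q, **counts})
    return out
```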

- Alex; 20180318

     March 16th, 2018

Had an in-person interview with $FIN_CORP. Very small firm. 6-people total, I think. They manage assets (hedge fund, of sorts). They used to do HFT a while back, but now do day-to-day or week-to-week positions. Most of the technical questions weren't very interesting (e.g. databases, etc.)

Interesting questions came from their quant guy. He asked me differences between sample variance and population variance. How market cap of a company impacts variance (apparently one can use market cap as a proxy for variance).

Here's the question: given a table, with x-axis being stocks, and y-axis dates (or timestamps), and values in the table being closing prices, for example. If you calculate the covariance matrix, you end up with the covariance of each stock with each other stock (and each stock's variance on the diagonal). Eigenvectors of that matrix correspond to portfolios that are correlated.

Inverse of covariance matrix gets you the precision matrix, which you can use to get portfolios. In theory. However, if you start with historical data, you end up with an estimate of the covariance matrix.

Now, the question, how does using a sample (not population) covariance matrix screw things up?

An inverse will essentially take an inverse of each eigenvalue... so the things with the lowest eigenvalues will get the biggest precision. So stuff that varies the most and isn't correlated with anything---often tiny cap stocks---will dominate the portfolios formed from sample covariance matrix. Stuff I didn't know before the interview :-)
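
The "inverse of each eigenvalue" step can be written out. This is the standard eigendecomposition of a symmetric matrix, assuming the sample covariance matrix (written \hat{\Sigma} here, my notation) has eigenvalues \lambda_i > 0 with orthonormal eigenvectors v_i:

```latex
\hat{\Sigma} = \sum_{i} \lambda_i \, v_i v_i^{\top}
\qquad\Longrightarrow\qquad
\hat{\Sigma}^{-1} = \sum_{i} \frac{1}{\lambda_i} \, v_i v_i^{\top}
```

So the directions with the smallest \lambda_i get the largest weight 1/\lambda_i in the precision matrix, and any estimation error in those small eigenvalues gets blown up along with them.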

As for $FIN_CORP, they gave me a Spark homework to do over the weekend. Reading stock quotes from various places and doing stuff with them. Easy stuff.

- Alex; 20180316

     March 15th, 2018

So... will the Fed raise interest rates on March 20th?

In other news, had a coding (coderpad) interview with $SOCIAL_MEDIA_CORP. The question they asked: if you're given a function such as int read4k(char* buf), that reads 4k of data, and returns number of bytes read, implement another function, int read(char* buf, int siz) that will have a more flexible interface---allow for reading of any siz (including 1 byte, or much more than 4k of data). It's been a really long while since I've done buffering at this kind of level, so it took me almost 30 minutes of tinkering before I managed to set up an iteration that worked.
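
A Python sketch of the buffering idea (not the interview code, which was against the C signatures above). The interface is adapted and those adaptations are my assumptions: read4k is a callable returning up to 4096 bytes (empty at end of data), and read returns bytes instead of filling a caller-supplied buffer. make_read4k is a hypothetical helper that fakes the data source.

```python
class Reader:
    """Serves reads of any size on top of a fixed 4k-chunk source."""

    def __init__(self, read4k):
        self._read4k = read4k   # callable: returns up to 4096 bytes, b"" at end
        self._buf = b""         # leftover bytes from the last chunk

    def read(self, size):
        chunks = []
        remaining = size
        while remaining > 0:
            if not self._buf:
                self._buf = self._read4k()   # refill only when empty
                if not self._buf:
                    break                    # source exhausted
            take = self._buf[:remaining]
            self._buf = self._buf[len(take):]  # keep the rest for next call
            chunks.append(take)
            remaining -= len(take)
        return b"".join(chunks)


def make_read4k(data):
    """Hypothetical stand-in for read4k: hands out data 4096 bytes at a time."""
    pos = 0

    def read4k():
        nonlocal pos
        chunk = data[pos:pos + 4096]
        pos += len(chunk)
        return chunk

    return read4k
```

The part that's easy to get wrong is the leftover: a 1-byte read still pulls a full chunk from the source, and the unread bytes have to survive until the next call.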

- Alex; 20180315

     March 14th, 2018

Happy PI day!

Happy landing anniversary!

And in other news, Stephen Hawking died. His book, A Brief History of Time was literally the first book I've ever borrowed from a library (yes, eventually I bought a copy of my own). That book inspired me to read more and more on the subject.

His primary idea, that black holes evaporate, is awesome. It's very weird how it works if you think about it... a pair of virtual particles forms just at the edge of the event horizon. One of them falls in, and somehow that causes the black hole to get smaller! (the system of the particle outside the black hole has more energy than the system with the particle already in the black hole---and that energy difference is carried off by the other particle of the virtual pair). Pretty awesome stuff.

- Alex; 20180314

     March 13th, 2018

So my car (Toyota 4Runner) was recalled due to a faulty passenger-side air-bag inflator. Scheduled an appointment at Queensborough Toyota and went there to have it serviced---and after I walked around their showroom for about an hour, waiting for my car to get done, they came back and said they don't have the part and will need to order it :-/

So will need to come back. Urgh. Why couldn't they have ordered the part before the appointment? They knew WHY I was coming there, they knew exactly the car and year, etc., it's not like they couldn't have known they don't have the part :-/

- Alex; 20180313

     March 12th, 2018

Uh, oh, more storms!

Took the kiddo to the B&N book store today, and for a walk around Union Square.

In other news...

Not pursuing the $AD_CORP_NAME offer (initial indications are that it's on the low end).

$SOCIAL_MEDIA_CORP scheduled a coding interview for Thursday.

- Alex; 20180312

     March 11th, 2018

Happy Anniversary :-D

Time flies like an arrow... and fruit flies like a banana :-)

- Alex; 20180311

     March 9th, 2018

Fixed my Ikea bed. The other day, it fell apart. Or rather, a tiny welded piece of it fell off such that the whole thing just sorta didn't hold together. Upon Suneli's suggestion, we got nuts and bolts at Home Depot (cheap stuff, 11 cents for a bolt, 6 cents for a nut, and 17 cents for a spring washer), drilled two holes right through the broken weld, in the metal piece and the leg of the bed, then just screwed the two components together---very tight. I think it's stronger now than it was when it was new.

Feeling awesome having finally done something productive with my toolbox :-)

In other news...

Had a coderpad interview with another $AD_CORP_NAME. The problem was: given a dataset of tv channel records, such as tvid, channel, and time, find how many distinct tvids watched NBC between 10am and 11am for at least 5 minutes. Since I was allowed to use SQL, that took about 5 or so minutes to code... (lead function to find next time channel is switched, get duration, filter on channel and duration, extract only relevant time window, then count distinct, etc.). Would've taken at least 30 minutes if using Java---and it wouldn't have scaled as well. They scheduled a followup call next Monday.
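
The same steps (lead, duration, filter, window, count distinct) sketched in Python rather than SQL. Assumptions that are mine: each record is (tvid, channel, time) with time in minutes since midnight, a record means "switched to this channel at this time", the 5 minutes must come from a single viewing stretch inside the window, and a tvid's last record has no known end time, so it's skipped (the same as a null lead).

```python
from collections import defaultdict

def count_viewers(records, channel="NBC", start=600, end=660, min_minutes=5):
    """Count distinct tvids that watched channel for at least min_minutes
    in one stretch between start and end (minutes since midnight)."""
    by_tv = defaultdict(list)
    for tvid, ch, t in records:
        by_tv[tvid].append((t, ch))

    viewers = set()
    for tvid, events in by_tv.items():
        events.sort()
        # Pair each event with the next one: the equivalent of SQL lead().
        for (t, ch), (t_next, _) in zip(events, events[1:]):
            if ch != channel:
                continue
            # Portion of this viewing stretch that falls inside the window.
            overlap = min(t_next, end) - max(t, start)
            if overlap >= min_minutes:
                viewers.add(tvid)
    return len(viewers)
```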

Had another chat with $CORP_NAME. Apparently they're getting ready to make an offer (didn't yet), and wanted to know details such as whether I'd need a visa, relocation funds, notice period, compensation for missed-bonus, etc., also if I've signed any non-competes or non-disclosures that they should be aware of.

In yet more news, $SOCIAL_MEDIA_CORP finally got back to me, and scheduled a call for Monday. Apparently the recruiter I was chatting with before moved on from HR, and it took them a while to reach out to his contacts.

- Alex; 20180309

     March 7th, 2018

So much for the storm. At least in NYC, it didn't snow nearly as badly as in other places.

Had yet another interview with $CORP_NAME. This one more business-like and less techy wise. Mostly what the role will be about, etc. Sounds like an interesting role---doing lots of very interesting and unspecific things.

- Alex; 20180307

     March 6th, 2018

Uh, oh, storm!

In other news, was interviewed by $CORP_NAME.

Had a total of 4-hours of techy-question interviews, 1-hour at a time, with two folks each. Then in the middle about an hour or so lunch with another techy. Dunno where the other hour went---perhaps buffered between meetings. Spent a total of 6 hours there (from 9am to 3pm).

They got office coffee... as much as you can drink. And I did, as much as I could :-)

They sure like questions about hash tables. At least three teams asked me to describe how a hash table works. Essentially they're looking for the java.util.Hashtable implementation---array backed with a linked list (or tree) at each location for all the entries whose hash brings them to that array index. They asked for an implementation of put and get methods (written on paper). I guess they like it because it has arrays, hashes, AND linked lists... all combined into one question. Probably a good interview filter.
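
A minimal Python sketch of that kind of table, with put and get. It's not java.util.Hashtable: there's no resizing, and each bucket is a plain Python list standing in for the linked list, both simplifications of mine.

```python
class HashTable:
    """Array of buckets; each bucket chains the entries that hash there."""

    def __init__(self, capacity=16):
        self._buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        # The hash picks the array index; collisions share a bucket.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default
```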

They also asked for a binary search tree construction---the insert and lookup methods. There was a cursory question on how balanced trees work---they wanted to know if I knew what a pivot operation was.
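
The insert and lookup methods, sketched in Python for an unbalanced tree (no rotations). Ignoring duplicate keys is my choice, not something they specified.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the (possibly new) root. Duplicates are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
        else:
            break  # key already present
    return root

def lookup(root, key):
    """Return True if key is in the tree."""
    node = root
    while node is not None:
        if key == node.key:
            return True
        # Smaller keys live in the left subtree, larger in the right.
        node = node.left if key < node.key else node.right
    return False
```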

As well as the array representation of a binary tree... e.g. given an index n, the left child is at 2n+1, and the right child is at 2n+2 (for arrays that start with 0).
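
In Python, the index arithmetic looks like this. The parent formula isn't something they asked about; it's just the inverse of the two child formulas.

```python
def left(n):
    return 2 * n + 1

def right(n):
    return 2 * n + 2

def parent(n):
    # Inverse of left() and right() for any n >= 1.
    return (n - 1) // 2

# A complete binary tree stored level by level:
#         1
#       2   3
#      4 5 6 7
tree = [1, 2, 3, 4, 5, 6, 7]
```

Here `tree[left(0)]` is 2 and `tree[right(1)]` is 5.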

Then there was a question on in-order, pre-order, or post-order traversals of binary trees.
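
The three traversals differ only in where the node itself is visited relative to its subtrees. A Python sketch, using (value, left, right) tuples as a throwaway tree representation (my choice for brevity):

```python
def pre_order(t):
    if t is None:
        return []
    value, left, right = t
    return [value] + pre_order(left) + pre_order(right)   # node first

def in_order(t):
    if t is None:
        return []
    value, left, right = t
    return in_order(left) + [value] + in_order(right)     # node in the middle

def post_order(t):
    if t is None:
        return []
    value, left, right = t
    return post_order(left) + post_order(right) + [value]  # node last

# Tree with 2 at the root, 1 on the left, 3 on the right.
tree = (2, (1, None, None), (3, None, None))
```

For a binary search tree, the in-order traversal comes out sorted.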

How to find loops in a linked list? (perhaps the end of the list ``accidentally'' points to an inner node in the linked list). Yeah, solve that using a seen set (or a bit field if vertices can be integers). This is essentially a spanning tree algorithm.
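
The seen-set approach, sketched in Python. ListNode is a hypothetical minimal node class; nodes are tracked by identity, so two nodes holding equal values aren't confused with each other.

```python
class ListNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def has_loop(head):
    """Return True if following next pointers ever revisits a node."""
    seen = set()
    node = head
    while node is not None:
        if id(node) in seen:
            return True      # been here before: the list loops back
        seen.add(id(node))
        node = node.next
    return False             # reached the end, so no loop
```

This uses O(n) extra memory for the set.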

Then a question regarding some aggregate and sort---I think that interviewer didn't have a good question lined up (after his hash table question got derailed for being asked 3-times already), so just wanted to complicate the select thing,count(*) from something group by thing order by 2 question... (by saying he wanted to sort the output by the frequency of when in time thing happens---that he couldn't clearly define). In the end he accepted ordering by average time of thing. (just another aggregate, just to add to order by clause). i.e. a stupid question.

More interesting question was about testing---that I really didn't have a good answer for. Or rather, I had what I think is a good answer, but I don't think they understood it, nor wanted to hear what I was saying. The question was: ``you said you wrote a pattern detection program in SQL, how did you test it?''

They were looking for ``ah, I'm a firm believer in Agile, and test driven development---so test cases for everything before development even starts. I also love SCRUM meetings!'' That would've been the end of it. But that's a bit too much bullshit.

The `problem' is that SQL is declarative. For example, let's say the requirement says: "select the first name of all customers whose last name is 'Johnson'". (yes, this is a gigantic oversimplification). And then you wrote a SQL statement (implemented the requirement) as:

select fname from customers where lname='Johnson';

and it's syntactically correct, it returns the output you expected, it's going after the correct table, etc. It works. But... but but... how did you TEST it???

How do you know it really works?

How do you know it satisfies the requirement? (they didn't ask these, but they're all fair questions).

Or better, how do you know it solves the business problem the requirement was addressing?

They were looking for some answer that involved test cases. You know Agile, test driven development. That sort of thing. urgh. But yeah, at FINRA, we did have ``test cases''... the QC/QA team verified results with known inputs and known outputs for every SQL query.
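
For concreteness, a known-inputs/known-outputs check for that query could look something like this in Python, using the standard sqlite3 module with an in-memory database. This is only an illustration of the general shape, not how the QC/QA team did it; the table layout and the sample rows are made up.

```python
import sqlite3

QUERY = "select fname from customers where lname='Johnson'"

def run_query(rows):
    """Load known input rows into a scratch table and run the query."""
    db = sqlite3.connect(":memory:")
    db.execute("create table customers (fname text, lname text)")
    db.executemany("insert into customers values (?, ?)", rows)
    result = sorted(r[0] for r in db.execute(QUERY))
    db.close()
    return result

# Known inputs...
rows = [("Ann", "Johnson"), ("Bob", "Smith"), ("Cy", "Johnson")]
# ...and the known output we expect for them.
expected = ["Ann", "Cy"]
assert run_query(rows) == expected
```

Notice that the expected list was built by a person applying "last name is 'Johnson'" to the sample rows by hand.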

But really, think about it... what would a test case for this SQL look like? What assertion can you test that would tell you if it's working correctly?

What language (what statement) would you specify such an assertion in? Would the assertion have lname='Johnson' in there? Eh?

I guess my point is that, SQL by its nature, is declarative. Test cases are too.

Test cases say WHAT output you want given certain inputs. They don't say HOW to calculate it. You pass them once you write the code telling the computer HOW to calculate the output.

You're testing that the HOW part matches the WHAT part.

SQL is declarative: you do not tell the computer HOW to calculate the result. You just say WHAT result you want, and the database calculates it. If you test the HOW part, you're actually testing that the database product is functioning correctly (which isn't the purpose of testing).

We can step it up a notch, and say the business requirement "select the first name of all customers with last name 'Johnson'" was automatically behind-the-scenes parsed into SQL and executed by the database:

"select the first name of all customers with last name 'Johnson'" to "select fname from customers where lname='Johnson'; "

Now you get output from the database. Does that output match the requirements? When you start testing this, you're effectively testing whether the database execution engine is operating correctly---and that's not your job (might as well test the operating system too, and the compiler used to build the database, etc.)

Now back to the business requirement. Perhaps it had a `bug', how would you know? What wouldn't match where for you to notice this? These are problems test cases cannot detect---and the biggest problems in systems are these kinds of problems.

Now, I'm not saying there's no way to ``test'' a pattern recognition system (whether it's written in SQL or C++, etc.). You can... That's why there are plenty of machine learning data sets out there, with training and test data---so you could create a black-box that assigns a label to stuff, run that black box against test data... But then you're not testing against a requirement: you're testing against data.

And no, when you start with a business requirement, you often don't have data for this kind of a test. In fact, often the requirement drives the creation of the data, which if you test against could circularly confirm your requirement. E.g. that's like saying: we'll use this SQL statement to pull some data, and then see if that SQL statement pulls that data, if yes, test passed.

But... on an interview, if someone asks "how did you test it", the proper response is: ``we used test cases''---unless of course you don't feel like bullshitting.

- Alex; 20180306

     March 1st, 2018

Stopped by the physical Amazon store, and it's awesome! It's like books and little trinkets, right on the shelves. I was surprised to find that a large percentage of the books on the shelves I've actually read (apparently those are the ones that folks buy the most?) so they stock them in the physical store?

Pretty nice store. Not sure if I'd make it a point to go in there (shopping on amazon.com is just so much easier). But if you want to have a coffee and flip through a book before buying, that's a pretty good alternative to... amm... Barnes & Noble (which often has Starbucks coffee :-)

- Alex; 20180301

     February 28th, 2018

Had a 2nd coding interview with $CORP_NAME. They asked to implement something that was essentially:

select avg(grade) from students group by name order by 1 desc limit 1;

In other words, find the highest average grade for all students. But had to code it in Java (not as simple as it is in SQL). Took me a whole 15 minutes or so, using TreeMap for grouping, and a loop on map values to find the maximum.
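
The same group-then-max idea sketched in Python, with a dict playing the role of the TreeMap. It assumes records are (name, grade) pairs and that there's at least one record.

```python
def highest_average(records):
    """records: iterable of (name, grade). Returns the best per-student average."""
    totals = {}   # name -> (sum of grades, number of grades)
    for name, grade in records:
        s, c = totals.get(name, (0, 0))
        totals[name] = (s + grade, c + 1)
    # Loop over the grouped values to find the maximum average.
    return max(s / c for s, c in totals.values())
```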

Also went for an interview with $HFT_CORP. They style themselves as a HFT firm, doing market making in options. They consume prop feeds from everywhere, use lattices to calculate fair value (this part supposedly runs on GPUs), and then publish their quotes... Everything they do is mostly C++, and they don't seem to believe in the cloud. The prop feeds (mostly option-maker quotes) are about half a terabyte per day, which they store in their own interval storage thing. The interview didn't have any tech questions. Like zero. They want to build a better market-replay tool, for better and faster back-testing of trading strategies. The guy who interviewed me was a big fan of spinning disks... saying that for streaming work flows, they're just as good as solid state disks (e.g. for constant continuous reading or writing). Yep, ~100MB/second is better than ~2GB/second, but, eh, it works for them.

- Alex; 20180228

     February 23rd, 2018

Today is my last day at FINRA :-/ I've been with market regulation (first at NYSE and then at FINRA) for almost 13 years---not counting SIAC :-). The database my team built quite literally changed the industry.

Still haven't decided on what to do next. Join another place asap, or take a break and work on my PhD (thinking of a greedy optimization mechanism for n-tuple networks; similar to information gain measure in decision tree construction).

Anyways, good times ahead...

- Alex; 20180223

     February 21st, 2018


- Alex; 20180221

     February 20th, 2018

You know you're drinking too much coffee when...

- Alex; 20180220

     February 19th, 2018

Kiddo discovered loud noise, or rather, that he can make loud noise. Before, it was reasonably loud giggles, now it's volume-set-to-11 giggles... mostly whenever something fun happens on TV (he's a big fan of Sunny Bunnies on YouTube Kids).

- Alex; 20180219

     February 17th, 2018

Kiddo graduated from size 2 diapers to size 3 diapers :-D

- Alex; 20180217

     February 16th, 2018

``It doesn't make sense to hire smart people and then tell them what to do; we hire smart people so they can tell us what to do.'' --Steve Jobs

Had a phone interview with $CORP_NAME. I think it went well. They asked a coding question: imagine a kid walking up the stairs. They can hop 1 stair at a time, or 2. There are N stairs. How many different ways can the kid hop up the stairs?

For example, for 1 stair, there's only one way. For 2 stairs, the kid can go 1-stair and 1-stair, or just 2-stair (so two ways). For 3 stairs, the kid can do 1,1,1, or 1,2, or 2,1, so three ways total.

Managed to solve the problem in about 15 minutes. Java program. int stairs(int n){ if(n < 0) return 0; if(n == 0) return 1; return stairs(n-1) + stairs(n-2); }

Yes, there's a non-recursive dynamic programming solution too---but they didn't ask about that.
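For completeness, the non-recursive version could look roughly like this. It's a sketch, not something from the interview: the class and method names are made up, and it counts zero stairs as one way (do nothing), which is what makes the recurrence ways(n) = ways(n-1) + ways(n-2) line up with the 1, 2, 3 example above.

```java
class Stairs {
    // Number of distinct ways to climb n stairs with hops of 1 or 2 stairs.
    // Only the last two counts are kept, so this is O(n) time and O(1) space.
    static long countWays(int n) {
        if (n < 0) {
            return 0; // overshooting the top is not a valid way
        }
        long prev = 1; // ways(0): the empty sequence of hops (assumed convention)
        long curr = 1; // ways(1): a single 1-stair hop
        for (int i = 2; i <= n; i++) {
            long next = prev + curr; // last hop was a 1-stair or a 2-stair
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " stairs: " + countWays(n)); // 1, 2, 3, 5, 8
        }
    }
}
```

These are just Fibonacci numbers, so the count grows fast; a long will overflow once n gets into the 90s, and BigInteger would be needed beyond that.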

- Alex; 20180216

     February 13th, 2018

Aveeno baby products rock! (no, they didn't pay me to say this).

- Alex; 20180213

     February 12th, 2018

Noticed that my kid started to recognize himself in the mirror. Before, he'd just look at the mirror, without much of a reaction. Now he's smiling and staring into his own eyes, and then smiling some more... :-)

- Alex; 20180212

     February 3rd, 2018

Finished Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz. Can't really call it a good book---it has nothing technical, more of a popular book on the cool aspects of data science. It does have some awesome examples though---like this: Microsoft Finds Cancer Clues in Search Queries. Pretty damn amazing!

- Alex; 20180203

     February 2nd, 2018

Ran a few more experiments on EMNIST. Wrote a tiny handwritten digit recognizer using an n-tuple classifier. 22% error rate :-/ My guess is that for printed characters this would've yielded the published 90% accuracy (which isn't great by today's standards, but for a 20-line Perl script, it's not bad).

In other news, applied for Ian's passport today.

- Alex; 20180202

     February 1st, 2018

Learned something clever today: apparently we can do partial regression. E.g., in the linear system y = x1*w1 + x2*w2 + ... + xN*wN, if we run multiple regressions on x1..x5, x6..x10, etc., up to x(N-4)..xN, the results will be the same as if we fit all the dimensions at the same time, as long as the groups of dimensions are uncorrelated with each other (with correlated groups, each small regression soaks up effects that belong to the other groups, and the weights come out different). It's not hard to see why this works... think back to how gradient descent would fit all these dimensions: it adjusts each weight individually, and when the groups are uncorrelated those adjustments don't interfere across groups. So given samples in 100D space (the projections onto lower dimensions are still from 100D space), we can run 20 regressions of 5D each, and get the same weights as if we ran regression in 100D space.
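A tiny sanity check of this, as a sketch: two dimensions instead of 100, no intercept, made-up data with true weights 2 and 3, and hypothetical class and method names. It fits each dimension separately and both together, once with uncorrelated columns (where the separate fits match the joint fit exactly) and once with correlated columns (where they don't).

```java
class PartialRegression {
    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i] * b[i];
        }
        return s;
    }

    // 1D least squares without intercept: w = (x.y) / (x.x)
    static double fitOne(double[] x, double[] y) {
        return dot(x, y) / dot(x, x);
    }

    // Joint 2D least squares without intercept: solve the 2x2 normal equations by Cramer's rule.
    static double[] fitTwo(double[] x1, double[] x2, double[] y) {
        double a11 = dot(x1, x1), a12 = dot(x1, x2), a22 = dot(x2, x2);
        double b1 = dot(x1, y), b2 = dot(x2, y);
        double det = a11 * a22 - a12 * a12;
        return new double[] {(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det};
    }

    // Builds y = 2*x1 + 3*x2 (true weights chosen arbitrarily for the demo).
    static double[] target(double[] x1, double[] x2) {
        double[] y = new double[x1.length];
        for (int i = 0; i < y.length; i++) {
            y[i] = 2 * x1[i] + 3 * x2[i];
        }
        return y;
    }

    static void report(String label, double[] x1, double[] x2) {
        double[] y = target(x1, x2);
        double[] joint = fitTwo(x1, x2, y);
        System.out.println(label + ": separate = " + fitOne(x1, y) + ", " + fitOne(x2, y)
                + "; joint = " + joint[0] + ", " + joint[1]);
    }

    public static void main(String[] args) {
        double[] x1 = {1, 1, -1, -1};
        report("uncorrelated", x1, new double[] {1, -1, 1, -1}); // x1.x2 == 0
        report("correlated", x1, new double[] {1, 1, 1, -1});    // x1.x2 == 2
    }
}
```

On the correlated data the separate fits come out as 3.5 and 4.0 rather than 2 and 3, while the joint fit still recovers 2 and 3.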

- Alex; 20180201

     January 31st, 2018

Finished The Dictator's Handbook: Why Bad Behavior is Almost Always Good Politics by Bruce Bueno de Mesquita and Alastair Smith.

Pretty nice book---in fact, the best book on politics I've ever read. Also awesome for understanding corporate dynamics (and why for example Warren Buffett says his descendants should sell most of their Berkshire shares after he's gone). Highly recommend :-)

- Alex; 20180131

     January 30th, 2018

My kid rolled from stomach back onto his back today for the first time :-)

In other news, guess I'll be leaving FINRA next month...

- Alex; 20180130

     January 29th, 2018

First day of Spring classes :-)

- Alex; 20180129

     January 26th, 2018

So apparently there will very likely be an ``Introduction to Data Science'' course at Brooklyn College in Fall 2018.

- Alex; 20180126

     January 21st, 2018

Got The Stanley Parable. It's awesome. Pretty unique game. Kind of similar to Portal :-)

- Alex; 20180121

     January 19th, 2018

Finished reading The Rise and Fall of D.O.D.O.: A Novel by Neal Stephenson and Nicole Galland.

I'm a big Stephenson fan, and I can only imagine that this book has less Stephenson and more Galland. Really, that's the kind of book that he puts his name on these days?

Strangely, the concepts dealt with in this book were explored in Anathem: time/things unfold in a certain way, and if you know how to do it, you can steer time/things in a certain direction. In Anathem, it was the inner circle monks who could control radioactive decay (simply avoid histories where they get cancer from radiation and/or die, allowing them to live a really long time), and in D.O.D.O., it's witches (who could perform all sorts of magic, but only when isolated from the rest of the universe).

In any case, terrible book. Some parts are entertaining, but that's about it. Ending sucks too---very flat, no action or anything. Keep waiting for stuff to happen, and then it just doesn't and the book ends (come to think of it, that explains a lot of Stephenson's books---but usually there's a lot to chew on throughout the book, this one is kind of not fun :-)

- Alex; 20180119

     January 4th, 2018

Eh. Meltdown and Spectre. Speculative execution is apparently a security issue... so let's slow down everyone's computers by 30%.

Seriously though, I understand being very careful about such things in a shared environment, where you and everyone else are sharing the same machine to run virtualized stuffs... but for a personal computer mostly used for gaming... keep the security hole with the performance. (If the fix for this is pushed down everyone's throats, then it's not about rogue programs or attackers, it's about protecting DRM).

In other news, anyone claiming cloud to be "more secure than in-house hardware" should own up to their claims. This bug has been "out there" for months---months during which anyone's cloud data could've been compromised.

- Alex; 20180104

     January 3rd, 2018

Warren Buffett wins $1M bet against hedge funds and gives it to girls' charity. This should be front page news for everyone.

``In 2007, the famed billionaire investor made a $1 million bet that an S&P 500 stock index fund would outperform a basket of hedge funds over the course of a decade. The index fund returned 7.1% compounded annually over the 10-year period, easily beating the 2.2% average return of a basket of funds picked by asset manager Protege Partners.''

- Alex; 20180103

     January 1st, 2018

Happy New Year!

- Alex; 20180101

     December 28th, 2017

So... CUNYfirst pulled the plug on grade submission early this year (the deadline is usually after New Year's; this year, it was December 26th). All students got a 'Z' grade, and changing it to a proper grade is apparently a big pain in the neck. Might take some time for the registrar to enter the proper grades.

- Alex; 20171228

     December 21st, 2017

Entertaining read: Don't Let Architecture Astronauts Scare You.

- Alex; 20171221

     December 20th, 2017

Entertaining read: The Ten Fallacies of Data Science.

- Alex; 20171220

     December 11th, 2017

Hmm... CBOE Futures Exchange launched trading of ``Cboe bitcoin futures'' under the ticker symbol XBT. This will not end well. It seems bitcoin fans don't realize how futures work, and futures fans don't realize how bitcoin works.

- Alex; 20171211

     December 6th, 2017

Last day of database class. Yey.

- Alex; 20171206

     December 3rd, 2017

Ok, so every year I'm disappointed by the quality of ``hackers'' at YHack. This year, things hit a new level. How could a senior in computer science from Some-Ivy-School not know what happens when you click the submit button on a web-page... I mean, that there's a request to some sort of server, etc. Yes, that may not be part of the course load, but if you're techy enough to get a computer science degree, how can you not be just a bit curious about how the web works? Perhaps students were just half sleeping, but I saw some pretty amazing questions, such as `why is this .csv file not in json format'. The other issue is that many (too many) students seem to have a MacBook---and yeah yeah, it's all ``unix'', but unfortunately the MacBook users I've seen have very little unix knowledge. So they're trying to build something on a machine they can only point-and-click on...

- Alex; 20171203

     December 2nd, 2017

Spending weekend at YHack.

A surprising number of students, when asked ``do you own stock'' answered ``no, but I own bitcoin''. That's scary!!!

- Alex; 20171202


NOTICE: We DO NOT collect ANY personal information on this site.
© 1996-2016 by End of the World Production, LLC.