If you get this message when trying to insert data into a PostgreSQL database:
ERROR: duplicate key violates unique constraint
That likely means that the primary key sequence for the table you’re working with has somehow fallen out of sync, often because of a mass import process (or something along those lines). Call it a “bug by design”, but it seems that you have to manually reset the primary key sequence after restoring from a dump file. At any rate, to see if your values are out of sync, run these two commands:
SELECT MAX(the_primary_key) FROM the_table;
SELECT nextval('the_primary_key_sequence');
If the first value is higher than the second value, your sequence is out of sync. Back up your PG database (just in case), then run this:
SELECT setval('the_primary_key_sequence', (SELECT MAX(the_primary_key) FROM the_table)+1);
That will set the sequence to the next available value, higher than any existing primary key in the table.
**Program order rule.** Each action in a thread happens-before every action in that thread that comes later in the program order.
**Monitor lock rule.** An unlock on a monitor lock happens-before every subsequent lock on that same monitor lock.
**Volatile variable rule.** A write to a volatile field happens-before every subsequent read of that same field.
**Thread start rule.** A call to Thread.start on a thread happens-before every action in the started thread.
**Thread termination rule.** Any action in a thread happens-before any other thread detects that thread has terminated, either by successfully returning from Thread.join or by Thread.isAlive returning false.
**Interruption rule.** A thread calling interrupt on another thread happens-before the interrupted thread detects the interrupt (either by having InterruptedException thrown, or by invoking isInterrupted or interrupted).
**Finalizer rule.** The end of a constructor for an object happens-before the start of the finalizer for that object.
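The volatile variable rule is the one most often relied on for safe publication. Here is a minimal sketch (class and field names are my own): because the write to `ready` is volatile, everything the writer thread did before it, including the plain write to `data`, is visible to any thread that subsequently reads `ready` as true.

```java
// Sketch of the volatile variable rule: the plain write to `data`
// happens-before the volatile write to `ready`, which happens-before
// the reader's volatile read, so the reader must observe data == 42.
public class HappensBefore {
    static int data;                 // plain field, published via the volatile flag below
    static volatile boolean ready;   // volatile flag establishing the happens-before edge

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            data = 42;       // 1. plain write
            ready = true;    // 2. volatile write publishes the write above
        });
        Thread reader = new Thread(() -> {
            while (!ready) { }            // 3. volatile read: spin until the flag is set
            System.out.println(data);     // 4. guaranteed to print 42, never 0
        });
        writer.start();   // thread start rule: main's prior actions are visible in each thread
        reader.start();
        writer.join();    // thread termination rule: the joined thread's writes are visible here
        reader.join();
    }
}
```

Without the `volatile` keyword on `ready`, the reader could legally spin forever or print 0.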
hflush: This API flushes all outstanding data (i.e. the current unfinished packet) from the client into the OS buffers on all DataNode replicas.
hsync: This API flushes the data to the DataNodes, like hflush(), but should also force the data to underlying physical storage via fsync (or equivalent). Note that only the current block is flushed to the disk device.
When you use OutputStream.flush, it does not guarantee that the data is written to disk; it only flushes it to the OS. To guarantee persistence, it is better to use FileOutputStream.getChannel().force(true) or FileOutputStream.getFD().sync(), although performance may suffer.
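A small self-contained sketch of the difference (file name and helper are illustrative): flush() hands the bytes to the OS page cache, while force(true) asks the OS to push them down to the physical device.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class SyncDemo {
    // Writes data, flushes it to the OS, then forces it to the storage device.
    static long writeAndSync(File f, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(data);
            out.flush();                   // bytes handed to the OS; may still sit in the page cache
            out.getChannel().force(true);  // fsync-like: push file content and metadata to disk
            // Alternative: out.getFD().sync(); // roughly equivalent, via the FileDescriptor
        }
        return f.length();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("sync-demo", ".bin");
        System.out.println(writeAndSync(f, "important data".getBytes())); // prints 14
        f.delete();
    }
}
```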
Special thanks to Yongkun, who wrote a very good blog post on this topic.
The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.
(from Chapter 3, Item 9: Always override hashCode when you override equals, page 48, Joshua Bloch’s Effective Java)
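The quoted identity and the conventional 31-based hash can be checked directly (the class, seed, and field values below are illustrative, not from the book's example):

```java
public class ThirtyOne {
    // Conventional 31-based hash for a hypothetical two-field class.
    static int hash(int x, int y) {
        int result = 17;            // non-zero seed, as recommended in Effective Java
        result = 31 * result + x;   // the JIT may compile this to (result << 5) - result
        result = 31 * result + y;
        return result;
    }

    public static void main(String[] args) {
        int i = 1234567;
        // The shift-and-subtract identity from the quote:
        System.out.println(31 * i == (i << 5) - i);  // true
        System.out.println(hash(1, 2));
    }
}
```

The identity holds because 31 * i = 32 * i - i = (i << 5) - i in two's-complement arithmetic, even when the multiplication overflows.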
Snappy is a compression library that can be utilized by the native code. It is currently an optional component, meaning that Hadoop can be built with or without this dependency.
Download and compile the Snappy codecs, or install them from your distro’s repo. I installed the libsnappy and libsnappy-dev packages from the Ubuntu repo. If everything is fine, you can use -Drequire.snappy to fail the build if libsnappy.so is not found. If this option is not specified and the snappy library is missing, the build silently produces a version of libhadoop.so that cannot make use of snappy. After that, you just need to run the Maven native build (e.g. mvn package -Pdist,native -DskipTests -Drequire.snappy).
Copy these two files from the source directory to PENTAHO_INSTALL_PATH/lib/:
phoenix-core-4.3.1.jar
phoenix-4.3.1-client.jar
Create a new project in Pentaho: File -> New -> Transformation
From the left pane select **Design -> Input -> Table Input** and drag it to your transformation
Double-click your Table Input step and give your step a name
Click New next to the Connection select box to create a new database connection
Give your connection a name (Ex: Phoenix), then set:
Connection Type: Generic Database
Access: Native (JDBC)
Custom Connection URL: your ZooKeeper hosts (Ex: jdbc:phoenix:localhost:2181:/hbase)
Custom Driver Class Name: org.apache.phoenix.jdbc.PhoenixDriver
Then click OK to close the database connection settings popup