Notes On Technology: 2012

Thursday, October 18, 2012

"redhat_transparent_hugepage" can hurt Java performance

Most modern operating systems use the paged virtual memory model. In this model each process has its own "virtual" address space, and a special page table maps each virtual memory address to the corresponding physical memory address.

Mapping each byte of virtual memory to a physical address individually would be very inefficient, because every allocated byte would need an entry of 4 or more bytes in the page table. In practice memory is allocated in bigger chunks called pages. The most common page size is 4096 bytes.

There is a trade-off between page size and page table size. With small pages, memory is wasted on a huge page table. With big pages, memory is wasted on partially used pages, because an allocation request rarely corresponds to an exact number of pages, so the last allocated page is only partially used.
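To put rough numbers on this trade-off, here is a back-of-the-envelope sketch. It assumes a flat, single-level page table with 8-byte entries (typical for x86_64, whereas real page tables are multi-level), and the function name page_table_kib is made up for this illustration.

```shell
# Approximate size (in KiB) of a flat page table that maps all of memory.
# $1 = memory size in GiB, $2 = page size in KiB, $3 = entry size in bytes
page_table_kib() {
    entries=$(( $1 * 1024 * 1024 / $2 ))   # number of pages to map
    echo $(( entries * $3 / 1024 ))
}

page_table_kib 36 4 8      # 4 KiB pages -> 73728 KiB (72 MiB)
page_table_kib 36 2048 8   # 2 MiB pages -> 144 KiB
```

Under these assumptions, mapping 36 GB with 2 MB pages needs a table about 512 times smaller than with 4 kB pages, at the cost of much coarser allocation granularity.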

"Huge pages" is a memory allocation mode in which the page size is bigger than 4 kB. The actual page size depends on the OS and hardware platform; on most x86 systems it can be either 2 MB or 4 MB. If a process operates on big blocks of memory and the computer has enough physical memory installed, "huge pages" mode can significantly improve performance (up to 3x, according to some synthetic tests).

On Linux, if a process wants to use "huge pages" support, it should request it through the libhugetlbfs API functions, i.e. the program should be written and compiled with "huge pages" support in mind.

To bring "huge page" performance benefits to legacy software and software written without "libhugetlbfs" support, Red Hat has implemented a custom Linux kernel extension called "redhat_transparent_hugepage".

As I will show below, transparent huge pages are not always a good idea, even if you have more than enough physical memory installed.

To perform automatic Continuous Integration builds of our Java software we use a quite powerful server machine with 2 x Intel(R) Xeon(R) CPU X5660 @ 2.80GHz (2 CPU * 6 cores * 2[HT] = 24 threads), 36GB of memory and RHEL-based Scientific Linux with the 2.6.32-279.9.1.el6.x86_64 kernel.

Despite the reasonable number of processors and only 4 concurrent builds, the machine was showing a very high average CPU load of 35% or more and was plagued by random freezes during which almost everything was blocked for 3-5 seconds.

After thorough testing and analysis we noticed two interesting facts. First, the CPU spends most of its time in kernel mode. Second, for very short intervals of time the "khugepaged" daemon eats all the CPU power.

Based on these symptoms, it was an easy assumption that "khugepaged" is strongly correlated with the Java performance degradation and system freezes, which led us to the existing bug report on the CentOS bug tracker.

And indeed, disabling the "redhat_transparent_hugepage" extension eliminates the problem completely! Now the average CPU load rarely goes higher than 5% and the server can sustain up to 8 parallel Java builds.
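For reference, the current mode can be inspected through sysfs. The sketch below assumes the RHEL 6 control file /sys/kernel/mm/redhat_transparent_hugepage/enabled (mainline kernels use /sys/kernel/mm/transparent_hugepage/enabled instead); the helper name thp_active_mode is made up for this example.

```shell
# Control file on RHEL 6 derived kernels (an assumption for other systems).
THP_FILE=/sys/kernel/mm/redhat_transparent_hugepage/enabled

# Print the active mode from a line such as "[always] never":
# the kernel marks the current setting with square brackets.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

if [ -r "$THP_FILE" ]; then
    echo "THP mode: $(thp_active_mode "$(cat "$THP_FILE")")"
else
    echo "THP control file not found: $THP_FILE"
fi

# To disable until the next reboot (as root):
#   echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
```

The setting does not survive a reboot, so it has to be reapplied at boot time, for example from an init script.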

It is still not clear to me whether the problem lies in the way the JVM manages memory, which leads to huge "memory defragmentation" costs, or whether there is a bug in the "redhat_transparent_hugepage" implementation, but the fact is that at the moment "redhat_transparent_hugepage" doesn't work well with Java.

This means that if you are experiencing surprisingly bad performance of your Java web server or database, with occasional freezes or slowdowns, don't rush to blame Java. It may be your server OS doing some nasty "optimizations" behind your back.

Wednesday, October 17, 2012

Apache Ant: customizing JUnit tests behaviour with system properties

Despite the increasing popularity of Apache Maven, Gradle and other modern Java build automation tools, a considerable number (if not the majority) of Java projects are still based on Apache Ant. And the vast majority of Ant-based builds use the Ant JUnit task to run the tests.

Sometimes it is necessary to supply optional configuration to the JUnit tests that the developer does not want to store in the source control system. To name a few use cases, it may be the credentials used to connect to the testing database or (in the case of integration tests) even a user name and password needed to verify communication with "external" system components in production.

By default the Ant JUnit task runs all tests in the same JVM instance as Ant itself, and all system properties (including the properties set on the command line as "-Dprop=value") are accessible from the test code. But running tests in the same JVM can cause undesirable side effects.

The easiest way to isolate the tests from the Ant environment is to set the "fork" attribute of the JUnit task to "true". But then all the tests run in a "clean room" environment and custom properties supplied to the Ant build are not accessible.

Fortunately, it is easy to instruct the JUnit task to pass all (or just some) system properties to the forked JVM by using "property sets".

For example, to pass all system properties to the forked JUnit JVM, nest a "syspropertyset" element inside the JUnit task and give it a "propertyref" child with the attribute builtin="all".

Thursday, October 11, 2012

EGit "not authorized" error when pushing changes to GitHub

I just lost half an hour trying to push local changes to one of my GitHub repositories over HTTPS.
Despite the fact that I had done this many times before, had proper write permissions and had specified the correct username and password when cloning the repository, I was constantly receiving a "not authorized" error!

It turned out that for some reason EGit ignores my credentials when performing a push over HTTPS: it doesn't really use them unless they are stored in the Eclipse "Secure Storage".

If you experience the same problem, try the following:

click "Configure" button at the bottom of the "not authorized" error dialog
click "Change" button next to URI field
re-type your username and password in the "Authentication" box
(!) set "[x] Store in Secure Store" check box

Thursday, October 4, 2012

Pre-configuring Eclipse JadClipse Java decompiler using Genuitec "Secure Delivery Center"

Genuitec "Secure Delivery Center" (SDC) allows you to provide customized, pre-configured and centrally managed Eclipse distributions for your team(s) behind the firewall.
JadClipse is an Eclipse plug-in that automatically gives you a decompiled version of the Java source code for any .class file you have.
This is invaluable when you don't have source code packages for your 3rd-party dependencies.
The current version of JadClipse (v3.4) doesn't include the Jad decompiler and requires an external Jad decompiler binary to be present somewhere on your system.
Let's assume you already have an SDC-managed Eclipse package. In my case JadClipse was configured on top of a package based on the Eclipse 3.7.2 SDK.

Mirror JadClipse in SDC as "JadClipse"
First of all, you will need to mirror JadClipse as a 3rd-party Eclipse extension. In the SDC Admin Console, go to Third Party Libraries -> Import New Library -> Import existing Eclipse update site -> Add Source Site.
Use "http://jadclipse.sf.net/update" as URL and choose "JDT Decompiler Feature" from the list of available plug-ins.

Mirror Jad decompiler binaries in SDC as "Jad Decompiler"
Download the Jad decompiler binaries for all platforms you need (Windows, Linux, Mac). You don't need to extract the .zip archives.

In SDC Admin Console, go to Third Party Libraries -> Import New Library -> Package binary contents for delivery
For each downloaded jad*.zip archive do "Add binary contents", select the .zip archive and mark the platform (Windows, Linux or Mac) it is built for.

Pre-installing new software into Eclipse package
Add "Jad Decompiler" and "JadClipse" to the Eclipse package software list, build and install the package (you may want to use "Test" build first).

Pre-configuring Jad binary location in JadClipse
Open the Eclipse installation directory and locate the Jad binary. Typically SDC installs it as ECLIPSE_HOME/binary/binary.contents-x.y.z/jad[.exe].
Copy & paste the full absolute path of the Jad binary to Eclipse - Window - Preferences - Java - Decompilers - Jad - Path to decompiler.
You may also want to check [x] Use Eclipse code formatter on the "Decompilers" page.

JadClipse stores all these settings as Eclipse preferences, so they are easily configurable with SDC.

Related property names are:

/instance/net.sf.jdtdecompiler.jad/net.sf.jdtdecompiler.jad.cmd
/instance/net.sf.jdtdecompiler.ui/net.sf.jdtdecompiler.use_eclipse_formatter

You may find all property names and values by opening the Eclipse - Help - About Eclipse - Installation Details - Configuration dialog.

For each supported platform create a new text file called "jadclipse-<platform>.epf" using your favorite text editor. Put "file_export_version=3.0" as a header and then the property values, one property per line.

Here is an example for the Windows platform. Please note that in your case the Jad binary location will be different!

file_export_version=3.0
/instance/net.sf.jdtdecompiler.ui/net.sf.jdtdecompiler.use_eclipse_formatter=true
/instance/net.sf.jdtdecompiler.jad/net.sf.jdtdecompiler.jad.cmd=C:/Software/eclipse/binary/binary.contents.9391-Ahb-8849.win_1.5.8/jad.exe

If you need to support more than one platform, you'll need to create an SDC "Environment Policy" and assign it to the Eclipse package. In the environment policy configuration there is a dedicated configuration page for each platform, so you can easily specify platform-specific preferences.

Test and promote package changes

Basically, that's it. After promoting package changes all developers in your team will automatically get pre-configured JadClipse working out of the box.

Wednesday, October 3, 2012

Oracle: nested query with data set extension in FROM clause

Oracle (as well as MS SQL and some others) allows nested queries in the FROM clause of a SELECT statement. Combined with data set extension using the UNION keyword, this can greatly simplify complex table joins.

Let's imagine we have two simple tables:

PERSONS
id name
1 John Smith
2 Mike Douglas

COMPUTERS
id name main_user_id
1 Computer1 1
2 Computer2 2
3 Computer3 NULL

Now we want a list of computers with the names of their main users. A plain join returns only the computers that have a main user:

SELECT c.name as "computer", p.name as "user"
FROM COMPUTERS c, PERSONS p
WHERE c.main_user_id = p.id

> computer user
> Computer1 John Smith
> Computer2 Mike Douglas

To get the full list of computers we could use an outer join:

SELECT c.name as "computer", p.name as "user"
FROM COMPUTERS c, PERSONS p
WHERE c.main_user_id = p.id (+)

> computer user
> Computer1 John Smith
> Computer2 Mike Douglas
> Computer3 NULL

Or, we can use a nested SELECT to extend PERSONS table by an "unknown" user in the context of this query:

SELECT c.name as "computer", p.name as "user"
FROM
COMPUTERS c,
(SELECT *
FROM PERSONS
UNION
SELECT 0 AS id, 'unknown' AS name
FROM DUAL) p
-- NVL maps a missing main user (NULL) to the placeholder id 0
WHERE NVL(c.main_user_id, 0) = p.id

> computer user
> Computer1 John Smith
> Computer2 Mike Douglas
> Computer3 unknown

This is much more verbose than a simple outer join, but it gives us an additional level of control over the input data and keeps the (+) outer join operator out of the WHERE clause.

Tuesday, August 21, 2012

How to send email to a group and exclude members of some other groups

Imagine the situation - you just sent an announcement e-mail to a few departments of your organization. For simplicity, let's assume that you have used appropriate LDAP groups "A", "B" & "C". And then you realize that you have forgotten to add one more group "D" to the recipients list. And this forgotten group "D" contains some users from "A", "B" or "C" as well as some unique users.

You don't want to "spam" dozens of people from "A", "B" & "C" again, do you?

Unfortunately, Outlook doesn't have a "Not to" field, so you have to prepare a filtered list of recipients:
"D*" = "D" - ("A" + "B" + "C")

Use Outlook to expand the "A", "B" & "C" groups to a list of users
Copy the ";" separated list to Word and replace all "; " with line breaks "^l"
Put the resulting list into Excel as the first column
Repeat the same steps for group "D" and put the resulting list into the second column
In Excel, select it all and choose "Conditional Formatting" - "Highlight Cell Rules" - "Duplicate Values"
Filter the second column to show only the "unique" (not highlighted) values
Et voilà - you have a list of people who are in "D" but not in "A", "B" or "C"!
Copy the list back to the "To:" field of the Outlook message and click "Check names"
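If a shell is at hand, the same "D" - ("A" + "B" + "C") filtering can be sketched without Word and Excel. This is only an illustration: the function names and the example.com addresses are made up, and the comparison is exact and case-sensitive, so addresses must be spelled identically in both lists.

```shell
# Turn a "; " separated recipient list (stdin) into one address per line.
split_recipients() {
    tr ';' '\n' | sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d'
}

# Print the lines of file $1 that do not occur in file $2 (whole-line match).
only_in_first() {
    grep -v -x -F -f "$2" "$1" || true
}

tmp=$(mktemp -d)
echo 'alice@example.com; bob@example.com' | split_recipients > "$tmp/abc.txt"
echo 'bob@example.com; carol@example.com' | split_recipients > "$tmp/d.txt"
only_in_first "$tmp/d.txt" "$tmp/abc.txt"   # prints carol@example.com
rm -rf "$tmp"
```

The expanded lists from Outlook would replace the two echo lines; the output can be pasted back into the "To:" field.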

Tuesday, August 14, 2012

Exporting and importing Eclipse workbench layout

Eclipse workbench layout defines which workspaces, views, shortcuts, toolbars and menus are active, as well as their size and position.

Eclipse 3.x doesn't store the workbench layout in preferences, so the only way to share the same layout among multiple workspaces is to copy ".metadata/.plugins/org.eclipse.ui.workbench/workbench.xml" from one workspace to another.
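The copy can be wrapped in a small helper. This is a minimal sketch: the function name and the workspace paths in the usage comment are hypothetical, and it assumes Eclipse is closed for both workspaces so the file is not rewritten on exit.

```shell
# Copy the Eclipse 3.x workbench layout from one workspace to another.
# $1 = source workspace directory, $2 = target workspace directory
copy_workbench_layout() {
    rel=.metadata/.plugins/org.eclipse.ui.workbench
    if [ ! -f "$1/$rel/workbench.xml" ]; then
        echo "no workbench.xml in $1" >&2
        return 1
    fi
    mkdir -p "$2/$rel" && cp "$1/$rel/workbench.xml" "$2/$rel/workbench.xml"
}

# Example with hypothetical workspace paths:
#   copy_workbench_layout "$HOME/workspace-main" "$HOME/workspace-new"
```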

Wednesday, August 1, 2012

Eclipse: generate JUnit test for the method using shortcut

By default Eclipse can only generate a JUnit test class for an existing Java class. If you later change the original class by adding a new method, you need to add the corresponding test method to the JUnit test manually.

With the free FastCode Eclipse plug-in you can do that with a single shortcut, "Ctrl + Alt + Shift + U".

Saturday, June 16, 2012

Editing Gnome system menu

The Gnome menu editor is, surprisingly, called "alacarte".

Installation:
sudo yum install alacarte gconf-editor

Thursday, June 7, 2012

Installing Atlassian Plugin SDK side by side with existing development environment

Atlassian Plugin SDK requires its own Maven settings, JDK version and set of environment variables (like ATLAS_HOME) to work.

If you're already an advanced user of Eclipse and Maven you may already have a customized setup that you don't want to modify in favor of Atlassian PSDK requirements.

It's possible to isolate all Atlassian setup with a simple shell script that will create a temporary new shell with proper environment for Atlassian plugin development and without any effect on the rest of your system.

These recommendations are for the Windows platform, but they should be easy enough to adapt for Linux.

Configuration instructions

Choose a location for PSDK files
Download Atlassian Plugin SDK
Configure PSDK Maven to use custom user settings location

Edit %ATLAS_HOME%\apache-maven\bin\mvn[.bat], modify command under the ":runm2" label by adding
"--settings %M2_HOME%\conf\user-settings.xml"
at the end of the command line

Create apache-maven\conf\user-settings.xml file

<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/settings/1.0.0" 
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
          http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <localRepository>${env.ATLAS_HOME}/repository</localRepository>
</settings>

Create shell script

set JAVA_HOME=C:\dev\atlassian\jdk1.6.0_31
set ATLAS_HOME=C:\dev\atlassian\atlassian-plugin-sdk-3.10.4
set M2_HOME=%ATLAS_HOME%\apache-maven
set M2_REPO=%M2_HOME%\repository
set M2=%M2_HOME%\bin
set PATH=%JAVA_HOME%\bin;%ATLAS_HOME%\bin;%M2%;%PATH%
cmd

Test that PSDK works
atlas-run-standalone --product jira
Download Eclipse IDE for Java EE Developers
Start Eclipse and configure workspace for your Atlassian projects
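For Linux users, the Windows script from the steps above might translate roughly as follows. This is a sketch only: the /opt/atlassian paths are hypothetical, the function name atlas_env is made up, and the variable names simply mirror the Windows script.

```shell
# Hypothetical install locations; adjust to where the JDK and the SDK were unpacked.
atlas_env() {
    JAVA_HOME=/opt/atlassian/jdk1.6.0_31
    ATLAS_HOME=/opt/atlassian/atlassian-plugin-sdk-3.10.4
    M2_HOME=$ATLAS_HOME/apache-maven
    M2_REPO=$M2_HOME/repository
    M2=$M2_HOME/bin
    PATH=$JAVA_HOME/bin:$ATLAS_HOME/bin:$M2:$PATH
    export JAVA_HOME ATLAS_HOME M2_HOME M2_REPO M2 PATH
}

# Run the setup in a subshell so the calling shell keeps its own environment.
( atlas_env; echo "ATLAS_HOME=$ATLAS_HOME" )

# For interactive work, start a new shell instead of the echo:
#   ( atlas_env; exec "${SHELL:-/bin/sh}" )
```

Leaving the spawned shell with "exit" returns to the original environment, which is the same isolation the "cmd" line provides on Windows.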

Optional steps

Install m2eclipse, SVN, etc. and configure m2eclipse to use "external" PSDK Maven installation and user settings file.
Download and attach Atlassian product source code. You need to have an account on http://my.atlassian.com to do that.

Eclipse and m2e configuration troubleshooting

Error "Build path is incomplete. Cannot find the class file for com.atlassian.*"
Solution Verify that Window - Preferences - Maven - User settings - Local repository points to the same location that is specified in conf\user-settings.xml.
Error "Plugin execution not covered by lifecycle configuration"
Solution Use "quick fix" suggestion to ignore all such errors

Saturday, May 19, 2012

Carnegie Mellon Robotics Academy

Lego Mindstorms and VEX robot building instructions, resources, tutorials
http://www.education.rec.ri.cmu.edu/content/lego/index.htm

Sunday, May 13, 2012

Using SLAM algorithm to map surrounding environment and determine location of NXT Rover Explorer

Initially a robot has no idea where it is, how it is oriented in space or what its environment looks like. This means that the robot has to explore its initial position, recognize useful landmarks and use them to become certain about its location.

Robot has no idea about its location, environment and position of ultra sonic sensor

After the initial localization step it will be possible to command the robot to go to a point in the unexplored space, defined relative to the robot's "home" location.

Robot got some data from radar that could be used as constraints to estimate possible location. Still, many locations are possible, so robot isn't localized yet.

While moving to the defined point, the robot should detect and avoid any obstacles, build an environment map and use it for localization purposes.

Some SLAM implementations

Reference implementation of the FastSLAM algorithm by Sebastian Thrun's group, in Java
EducationalRobots project on Google Code (contains Fast SLAM implementation)

Further resources

ROS (Robot Operating System) project
OpenSLAM set of SLAM related projects, algorithms and implementations
MyRobotLab project on Google Code
Bashir's Robotics Vision project Using a robot equipped with video camera to generate 2D environment map

Saturday, May 12, 2012

Rover Explorer Robot building instructions

All main robot components are from the standard Lego Mindstorms NXT 2.0 kit. A few additional Lego bars (black) were used to make the construction more solid, but they can be replaced with lighter parts from the NXT 2.0 set.

Complete robot view

Building instructions

Servo for left track (partially assembled)

Complete servo for left track

Left track with servo, assembled

Both left and right tracks and servos assembled

Connecting robot base parts together (side view)

Connecting robot base parts together (back view)

Turnable (servo) platform with ultrasonic sensor, parts

Ultra sonic sensor platform, assembled

Front view of robot base just before sonic platform installed

Robot base with ultra sonic platform installed

Fixings for NXT Brain

Robot base with NXT Brain fixings, assembled

NXT Brain mounted on the robot platform

Installing contact and color sensor, front view

Contact sensor on the back

Complete NXT Rover Robot (no wires connected yet)

NXT Rover Robot mechanical test using NXT Remote

Rover Explorer Robot (Mindstorms NXT 2.0, LeJOS)

Inspired by design of

and some code pieces from the "legointernational" project on Google Code, I decided to build an NXT Mindstorms 2.0 and LeJOS based rover explorer robot matching three main design goals:

maximum sensing capabilities available in standard NXT 2.0 kit: touch, ultrasonic and color sensors
durability and ruggedness: use tracks instead of wheels
simplicity: minimize the number of components

Step 1. Building Rover Explorer Robot platform
Step 2. Implement ultrasonic sensor based feature detection
Step 3. Implement Fast SLAM (Simultaneous Localization and Mapping)
Step 4. Implement environment exploration and motion planning
Step 5. Real-time data exchange with PC and graphical dashboard

Getting insight on Eclipse usage in your organization

Starting from version 3.4 (Ganymede), Eclipse includes the Usage Data Collector (UDC) feature, which installs listeners on the Eclipse workbench and gathers anonymous information about user actions - activation of views and perspectives, plug-in versions, etc.

While the UDC project raised some privacy concerns in the world-wide community of Eclipse users and turned out to be less useful for Eclipse.org than expected, it isn't really "an evil" thing. It may be a valuable source of information for smaller domains such as a single organization, department or team, or for getting automatic usage feedback for an RCP-based application.

By default UDC stores all usage data in CSV format in
.metadata/.plugins/org...epp.usagedata.recording/upload#.csv
and regularly sends all data to Eclipse.org, but this can be changed by using run-time property
-Dorg.eclipse.epp.usagedata.recording.upload-url=http://your-internal-udc-server

To avoid tracking privacy-sensitive information like user name, computer name, etc. Eclipse UDC creates two neutral UUIDs to identify particular installation and workspace:

$HOME/.org.eclipse.epp.usagedata.recording.userId
$WS/.metadata/.plugins/org.eclipse.epp.usagedata.recording/.org.eclipse.epp.usagedata.recording.workspaceId

There are several possibilities to collect usage data from Eclipse and RCP application instances:

An open source Eclipse Usage Data Collector Server implementation on project Kenai (aka Java.net). Follow the UDC server configuration instructions.
The original Eclipse server-side UDC code (PHP) available as an attachment to Eclipse Bug 221104

Alternatively, you may decide to use org.eclipse.epp.usagedata.recording.uploader extension point and provide your own uploader.

Additional Resources

The Eclipse Packaging Project and its Usage Data Collector in RAP and RCP Applications by Markus Knauer (EclipseSource)
Discussion for Eclipse Bugzilla 347069 - Eclipse UDC feature "retirement" and some valid use cases
Coding Spectator plug-in by the University of Illinois - similar idea to Eclipse UDC, focused on JDT coding habits, refactorings, etc.
Eclipse UDC for SharpDevelop IDE

Wednesday, May 9, 2012

Maven and Tycho for Eclipse Plugin development

Tycho is a plugin for Apache Maven that adds support for "eclipse-plugin", "eclipse-feature" and "eclipse-repository" generation.

Apache Maven itself is an attempt to enforce standards on the Java builds, project directory layout and dependencies between different projects. Based on this infrastructure, Tycho is an attempt to extend this kind of standardization to the Eclipse and OSGi builds.

Here is a very nice presentation by Karsten Thoms about using Tycho for Eclipse development
http://www.slideshare.net/kthoms/maven-3-tycho

Wiki pages on Eclipse.org
http://wiki.eclipse.org/Category:Tycho

Tuesday, May 1, 2012

Java Technology Tutorials

Very useful places to check when you need to quickly get into a new Java technology or just refresh your memory about Java, Eclipse, Spring, Android or Web development.

"Java, Eclipse, Android and Web programming" by Lars Vogel

http://www.vogella.com/

"Java web development tutorials"

http://www.mkyong.com/

Slow WiFi re-connection with Asus Transformer 101

After an automatic update to Android 4 ICS my Asus Transformer 101 became very slow at re-connecting to a WiFi hot spot, sometimes taking up to one minute before it gets connected. If you have a similar problem, try to clean up the list of registered WiFi networks and remove all unnecessary remembered connections.

Update: problem returned after device was updated to Android 4.0.3 (2.6.39.4)

Wednesday, April 18, 2012

Broken Eclipse "External Tools Configuration" after JDK change

After switching the JDK from 1.6 to 1.7 I got a problem with Ant scripts in Eclipse:
"Specified VM install not found: type Standard VM, name jdk1.6.0_21"

Unfortunately, this is all the information Eclipse gives, so it took some time to realize that Eclipse stores a specific JRE name instead of "Default Workspace JRE" in the "External Tools Configurations" (in contrast with "Run & Debug Configurations").

Solution:
Run menu -> External Tools -> External Tools Configurations -> JRE tab -> [x] Run in the same JRE as the workspace
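To see which configurations still mention the old JRE, the launch files can be searched directly. This sketch assumes that non-shared launch configurations are stored as *.launch files under .metadata/.plugins/org.eclipse.debug.core/.launches in the workspace; the function name and the workspace path in the usage comment are hypothetical.

```shell
# List launch configurations in a workspace that mention a given JRE name.
# $1 = workspace directory, $2 = JRE name from the error message
find_launches_using_jre() {
    dir="$1/.metadata/.plugins/org.eclipse.debug.core/.launches"
    [ -d "$dir" ] || return 0
    find "$dir" -name '*.launch' -exec grep -l -F -- "$2" {} +
}

# Example with a hypothetical workspace path:
#   find_launches_using_jre "$HOME/workspace" "jdk1.6.0_21"
```

Each file listed corresponds to a configuration whose JRE tab should be switched as described in the solution above.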

Thursday, April 12, 2012

Usability problems with Eclipse IDE running on Gnome, Linux

A few tricks for a better experience using Eclipse on Linux with the Gnome environment.

Enable icons in menus:
gconftool-2 --type boolean --set /desktop/gnome/interface/menus_have_icons true

Disable Gnome keyboard shortcuts (many of them are conflicting and overriding default Eclipse key bindings):

# Clear every Gnome key binding listed in the keybinding definition files
cd /usr/share/gnome-control-center/keybindings
for entry in $(grep KeyListEntry * |cut -d'/' -f2- |cut -d'"' -f1); do
echo $entry
gconftool-2 --type string --set "/$entry" ""
done

This might not be enough for some keys, for example F10. You may find more details here.

Restore all default Gnome keyboard shortcuts (based on this post on askubuntu.com):

cd /usr/share/gnome-control-center/keybindings
for entry in $(grep KeyListEntry * |cut -d'/' -f2- |cut -d'"' -f1); do
echo $entry
gconftool-2 -u "/$entry"
done

Wednesday, April 11, 2012

Computer science courses from Stanford and Udacity

Recently I've discovered two amazing "free online universities" that give everyone a chance to strengthen computer science skills in such fields as artificial intelligence, machine learning, robotics, natural language processing etc. You may attend the courses at any convenient time, and there are lots of interactive quizzes along the way to make you confident about what you've learned.

Coursera - free online courses from Stanford, Berkeley, etc.
Udacity - free original interactive online courses

Setting up an aggregated Eclipse update site for your team with b3 aggregator

b3 Aggregator is an Eclipse application used to generate the main Eclipse Helios and Eclipse Indigo update sites.
It might be very useful if you need to make an aggregated Eclipse update site with a pre-configured set of plugins and categories for your team, or simply want to make Eclipse update sites available behind the corporate firewall.

One of the main requirements behind the b3 aggregator was the generation of a validated and consistent update site that contains only items that are compatible and can be installed together.

Because the majority of Eclipse plugins are singletons, this means that the b3 aggregator will mirror only one version of a plug-in, typically the most recent. In other words, the b3 aggregator can't do a "plain mirroring" of all available plug-in versions.

Install b3 Aggregator:

Download the Eclipse SDK to any machine that has Internet access
Install b3 Aggregator Editor by using http://download.eclipse.org/modeling/emft/b3/updates-3.7 as an update site

Create and configure an aggregation:

Create new general Eclipse project
Add new "Repository Aggregation" file
In the Aggregation Editor select Aggregation node and define "Build Root" (location where you want to store resulting update site) and "Label" properties
Using the Aggregation node context menu add new "Validation Set" and call it "Eclipse Core"
Under the "Validation Set" create new "Contribution" and give it the same label "Eclipse core"
For the "Contribution" create a new "Mapped Repository" and specify the desired main Eclipse update site location as a URL (e.g. http://download.eclipse.org/releases/indigo)
Under the Aggregation node add one or more "Configurations" for the platforms you want to support (e.g. "win32, x86", "linux, gtk, x86_64", etc.)

That's it, now you should be able to build your Eclipse update site!

Optionally, you may also:

Configure one or more "Custom Categories" (i.e. "Recommended software" for your team) under the "Aggregation" node

It is necessary to specify a non-empty "Label", "Identifier" and "Description" to avoid error messages in client's Eclipse log. See Bug#362894 for details.

Repeat steps 4 - 6 if you want to aggregate and mirror more update sites

Don't forget to make additional "Validation Sets" to extend the "Eclipse core", otherwise b3 will complain about missing dependencies

Define "Valid Configuration Rules" for platform-dependent extensions, like "epp.package.linuxtools"