The dumb things smart people believe

May 29, 2020

Smart people aren’t what they’re made out to be. They make the same stupid mistakes everyone else does. They’re just better at hiding it.

I’ve started a new position and I’m expanding my knowledge. In this case they’d like a Data Vault design. I’m not a data architect so I had no idea what it looks like. I’m game to learn though.

So I start reading. The descriptions are so full of rebranded terminology (C) that I can’t make any sense of them. I drill down trying to learn what a “satellite table” means to a data architect. I find that almost every one of the descriptions is a copy and paste, or a copy and rewrite, of the same description.

I’m finding a confusing mishmash of ideas about how to describe the data and how important “data governance” is, but an amazing lack of specifics in most cases.

During my research I come across one of the stupidest things I’ve heard a smart person say:

“Architectures should not be compromised due to technology or implementation problems in the engine layers.”

This is a wonderful example of “smart people speak” (sometimes known as “technobabble” or just “bullsh*t”). Translated, this means:

“Reality shouldn’t be allowed to cramp my perfect design.”

Wow.

I have to wonder how this guy got this old without reality cramping his style.


Part 2: Timey Wimey

February 19, 2020

Part 1 of this series is here

I wrote a monitor program to periodically check the status of business processes. Here are my requirements:

  • It needs to be a Windows application
  • It needs to be maintainable by programmers familiar with C#
  • I need to monitor several hundred processes.
  • I don’t want this application to burn a lot of CPU cycles or resources.

The most obvious way to monitor a process is to create a timer. I can attach a subroutine to it that interrogates the process and displays its state. The most obvious solution is to create a timer for each process, but I have hundreds of them to monitor.

The hardware in a general-purpose computer doesn’t have hundreds of timers on any CPU. A PC has only a handful of hardware timers, so the operating system simulates timers for the user. In old versions of Windows it wasn’t even possible, since there was a limit of sixteen timers.

Hidden deep in the bowels of Windows is a list of timers the user has created that it must manage. I wanted to learn if I could do it more efficiently.

The obvious solution to the problem is to use a single timer.

  • Calculate the running elapsed time.
  • Examine each entry in the list of user created timers.
  • If their requested time has elapsed call the associated code.

The issue with this is the timer is called a LOT. These calls each generate a small amount of overhead. CPU cycles are wasted doing nothing, heat is generated, and batteries lose a tiny bit more charge.
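
The single-timer idea above can be sketched in a few lines. This is C++ rather than C#, and it uses a simulated clock instead of a real Windows timer so it runs anywhere. The names `Entry`, `SingleTimer`, `tick`, and `idle_ticks_in_one_second` are made up for illustration and are not part of any Windows API.

```cpp
#include <functional>
#include <utility>
#include <vector>

// One entry per simulated "user timer". All names here are hypothetical.
struct Entry {
    long interval_ms;                 // how often the callback should run
    long next_due_ms;                 // absolute time of the next run
    std::function<void()> callback;   // code associated with this timer
};

class SingleTimer {
public:
    void add(long interval_ms, std::function<void()> cb) {
        entries_.push_back({interval_ms, now_ms_ + interval_ms, std::move(cb)});
    }

    // Called by the one real timer. Returns how many callbacks fired.
    int tick(long elapsed_ms) {
        now_ms_ += elapsed_ms;                  // running elapsed time
        int fired = 0;
        for (Entry& e : entries_) {             // examine every entry
            if (now_ms_ >= e.next_due_ms) {     // requested time has elapsed
                e.callback();
                e.next_due_ms += e.interval_ms;
                ++fired;
            }
        }
        return fired;
    }

private:
    long now_ms_ = 0;
    std::vector<Entry> entries_;
};

// Drive the timer for one simulated second in 10 ms ticks and count
// the ticks that found nothing to do.
int idle_ticks_in_one_second(SingleTimer& timer) {
    int idle = 0;
    for (int i = 0; i < 100; ++i)
        if (timer.tick(10) == 0) ++idle;
    return idle;
}
```

With a single entry due every 1000 ms and a 10 ms tick, 99 of the 100 ticks in that second do nothing except walk the list. That wasted work is the overhead in question.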

In Part three I’ll show a more efficient method.

 


Multi tasking process monitoring

February 9, 2020

I recently needed to check on the status of many processes from a single computer application. It’s simple enough to do, but the simple solution has problems. Here are my requirements:

  • It needs to be a Windows application
  • It needs to be maintainable by programmers familiar with C#
  • I need to monitor several hundred processes.
  • I don’t want this application to burn a lot of CPU cycles or resources.

I wrote a very nice solution to this problem but I’m curious how others would approach it. There’s always room to learn.

I’ll outline my solution in following posts.


Home Assistant for your IoT

February 6, 2020

Of course, being a tech geek, I’m interested in IoT. My car reminds me when I’ve left the lights on. Why shouldn’t my home look out for me too?

The problems with the “internet of things” are many.

  • You’re tied to cloud services that can be revoked at whim.
  • The devices were poorly designed and without security in mind.
  • If your internet connection is down so is your house.
  • All of them want you to install an app on your smart phone.

I stumbled on the “Home Assistant” project. It’s open source, free, and I loved the founder’s vision: “Good home automation never annoys but is missed when it is not working.”

I’m experimenting with it. I’ll post about the journey.

 


Flashed the Sonoff Touch with Tasmota today.

January 13, 2020

MQTT to your IoT Home Assistant server.
SonoffFlash
Tasmota

JSON encoding in BizTalk

January 9, 2020

The JSON output from BizTalk got you down?

Did you do what I did? Create a pipeline and drop in a JSON encoder, then find out it’s giving you something, but not what you wanted? Never fear, intrepid programmer!

  • Open that pipeline back up. Add an XML assembler into the ‘assembler’ step (BEFORE the JSON encoder)
  • Create a schema for the JSON output you want.
  • Add that JSON schema to the document schemas list in the XML assembler.
  • Deploy, test, and rejoice!

I know it’s strange but that’s how it works.

P.S. Unlike XML, JSON does not require a single root element. You’ll still need one in the schema that describes the output. If your target doesn’t want it, just name the root node whatever you like, go to the JSON encoder in the pipeline, and check the box labeled ‘Strip root node.’

Have a great new year!

Jay


Shell script to rename files from United States dates to international dates

November 13, 2010

I ended up with a lot of files named using US style dates ( month-day-year.txt ). These don’t sort nicely into chronological order so I wrote a small shell script to rename them. The files are changed from “month-day-year.txt” to “20year-month-day.txt”. The extension isn’t checked and is preserved so it works with any file extension. This does require a relatively modern version of Linux/Unix bash shell.

#!/bin/bash
# Rename month-day-year.ext files to 20year-month-day.ext
regex='^([0-9]{1,2})-([0-9]{1,2})-([0-9]{1,2})\.(.*)$'
# loop over the glob directly so names containing spaces survive
for f in *
do
 if [[ $f =~ $regex ]]; then
 # zero-pad month, day and year to two digits
 m=0${BASH_REMATCH[1]}
 m=${m:(-2)}
 d=0${BASH_REMATCH[2]}
 d=${d:(-2)}
 y=0${BASH_REMATCH[3]}
 y=${y:(-2)}
 # everything after the date is kept as the extension
 e=${BASH_REMATCH[4]}

 echo "$f" "20$y-$m-$d.$e"
 mv -- "$f" "20$y-$m-$d.$e"
 fi
done

++djs


The delimited list builder pattern

October 14, 2010

If you’re building a delimited list, whether for your SQL injection attack demonstration or just for data dumps, here’s a useful pattern for building it.

I’ll present it as (bad) C++ pseudo code with an explanation to follow. For this example I’ll assume the delimiter is a comma.

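string output;
string Delimiter = ",";
vector<string> values;   // a vector, since list has no operator[]
string separator = "";   // empty for the first element only
for ( size_t i = 0; i < values.size(); i++ )
{
  output += separator + values[i];
  separator = Delimiter; // every later element is prefixed with the delimiter
}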

You can’t simply add a delimiter after each list element. If you do then the last element will always have an incorrect
delimiter at the end. If you prefix each insertion with the delimiter then you end up with an incorrect delimiter before
the first element in your output.

This algorithm solves the problem by using the prefix method and changing the delimiter. The first insertion is done
using an empty delimiter to avoid the first delimiter problem. All insertions after the first use the correct delimiter.

For optimal performance you should use a reference/pointer to set the delimiter within the loop (if your language of choice
allows it). This avoids a memory copy operation to set the string for each iteration of the loop. An if statement could be used
instead of changing the separator but the cost of setting a pointer will be less than evaluation of a condition and a branch.

This algorithm works correctly even if the size of the list is not known in advance and even if the output cannot be edited once written to.
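
Here is a compilable sketch of the same pattern, using the pointer variant described above so that no string is copied when the separator changes. The function name `join` is made up for this example; it is not a standard library function.

```cpp
#include <string>
#include <vector>

// Join values with a delimiter using the "empty first separator" pattern.
// join() is a hypothetical helper name used only for this sketch.
std::string join(const std::vector<std::string>& values,
                 const std::string& delimiter) {
    static const std::string empty;
    const std::string* separator = &empty;  // nothing before the first element
    std::string output;
    for (const std::string& value : values) {
        output += *separator;
        output += value;
        separator = &delimiter;             // pointer assignment, no string copy
    }
    return output;
}
```

Calling `join({"a", "b", "c"}, ",")` gives `a,b,c`. An empty list gives an empty string, and a single element comes back with no delimiter at all.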

++djs


Using scope in C#

October 8, 2010

Microsoft managed code uses the garbage collector to reclaim space once objects are no longer referenced. This is a great advantage in that it frees the programmer from worrying about releasing memory. The programmer can’t predict when garbage collection will occur, though, and unfortunately that prevents you from using scope to your advantage. You can’t just put code in the destructor (the finalizer, in C# terms) and assume it will be called when the object goes out of scope. If you created a mutex object (like I did in a previous post about C++ scoping) and left the unlock in the destructor, you’d have no way to guarantee when your mutex would get unlocked.

There’s a handy trick to get around that problem. You can use scoping to your advantage if you combine the “using” keyword and the IDisposable interface. Implementing IDisposable adds a Dispose() method to your class. The using statement guarantees that Dispose() is called on the object it was given when control leaves the block, even if an exception is thrown. The combination of the two does what C++ scoping does.

Here’s an example of a class that is useful for WinForms programmers. If you’re doing a rigorous job you know you should change the cursor to provide a visual indication that the computer is working on a task. It’s a pain to ensure that the cursor is always set back to its original state when you finish the task. If an exception occurs you can end up with the user thinking the machine is still working on the task when it’s not.

using System;
using System.Windows.Forms;

public class BusyCursor : IDisposable
{
    private Cursor _PreviousCursor;
    private Form _ParentForm;

    /// <summary>
    /// Constructor
    /// Saves form’s current cursor
    /// Changes the cursor to a busy cursor
    /// </summary>
    /// <param name="ParentForm">Form whose cursor is changed</param>
    public BusyCursor( Form ParentForm )
    {
        _ParentForm = ParentForm;
        _PreviousCursor = _ParentForm.Cursor;
        _ParentForm.Cursor = Cursors.WaitCursor;
    }

    #region IDisposable Members

    /// <summary>
    /// Restores the cursor saved by the constructor
    /// </summary>
    public void Dispose()
    {
        _ParentForm.Cursor = _PreviousCursor;
    }

    #endregion
}

This class saves the cursor currently used by the form you pass to it. It then changes the cursor to the busy cursor. When it goes out of scope it restores the cursor to the original state.

You use the class like this:

public partial class Form1 : Form
{
    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        using ( BusyCursor ShowBusyCursor = new BusyCursor(this) )
        {
            // long running process here
            System.Threading.Thread.Sleep( 10000 );
        }
    }
}

It’s simple to implement, works very well, and isn’t terribly hard to understand.
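
For comparison, this is roughly what the C++ scoping behaviour being imitated looks like. It is only a sketch: `ScopedValue` and `run_task` are hypothetical names, and a plain string stands in for the form’s cursor so the example needs no GUI library.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Saves a variable's value, sets a temporary one, and restores the original
// in the destructor. ScopedValue is a hypothetical name for illustration.
template <typename T>
class ScopedValue {
public:
    ScopedValue(T& target, T temporary)
        : target_(target), previous_(target) {
        target_ = std::move(temporary);
    }
    ~ScopedValue() { target_ = previous_; }  // runs on every exit from the scope

    ScopedValue(const ScopedValue&) = delete;
    ScopedValue& operator=(const ScopedValue&) = delete;

private:
    T& target_;
    T previous_;
};

// Stand-in for a long-running task; returns the "cursor" seen after it ends.
std::string run_task(bool fail) {
    std::string cursor = "arrow";
    try {
        ScopedValue<std::string> busy(cursor, "wait");
        if (fail) throw std::runtime_error("task failed");
    } catch (const std::runtime_error&) {
        // stack unwinding has already run the destructor by this point
    }
    return cursor;
}
```

Whether the task finishes normally or throws, `run_task` returns `"arrow"`. The using block gives BusyCursor the same guarantee in C#.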

Another day in the life of a workaday programmer batman.

++djs


When not to use threads, part 2

September 25, 2010

The lesson learned from the previous experiment was that threads (at least pthreads on Linux) aren’t lightweight. They take quite some time to create and start executing. If your task is of pretty short duration you’ll spend more time setting up the thread than you will actually doing work. They also take up considerable memory. You’ll want to be considerate of these things when designing something using threads.

For my logging example task there’s another design that might be superior. Instead of starting a thread for each bit of data to be logged I could create a single thread that would log data and sleep. If more data is logged it can be passed to the sleeping thread and written. This will eliminate the setup time for each data write and greatly reduce the amount of system resources used if many writes are done in succession.

Let’s test that. Here’s the code to implement a single thread version:


// queue for log data
std::queue< std::string > LogQueue;
pthread_cond_t myconvar;
BMutex QueueLock;

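// thread function
void* QueueThread( void* arg )
{
  while ( true )
  {
    QueueLock.lock();
    // sleep until the producer signals that a batch is ready
    pthread_cond_wait( &myconvar, &QueueLock._mutex );
    // write everything queued so far
    while ( ! LogQueue.empty() )
    {
      fwrite( LogQueue.front().c_str(), LogQueue.front().size(), 1, fd );
      LogQueue.pop();
    }
    QueueLock.unlock();
  }
  return NULL; // never reached; the thread is cancelled by its creator
}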

{
  fd = fopen( "log4.txt", "wt" );

  pthread_setconcurrency( 2 );

  pthread_attr_t attr;
  pthread_attr_init( &attr );
  pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_JOINABLE );
  pthread_attr_setstacksize ( &attr, 65536 );

  pthread_cond_init( &myconvar, NULL );

  pthread_t thread;
  int rc = pthread_create( &thread, &attr, QueueThread, NULL );

  clock_t start = clock();

  for ( int i = 0; i < tests; i++ )
  {
    QueueLock.lock();
    // queue the data
    LogQueue.push( sz );
    if ( LogQueue.size() % 4 == 0 )
    {
      // signal the thread to write it
      pthread_cond_signal( &myconvar );
    }
    QueueLock.unlock();
  }

  clock_t stop = clock();

  double d = ((double)(stop-start))/ CLOCKS_PER_SEC;
  printf( "%8.3f seconds (%d clocks)\n", d, (int)(stop-start) );

  // wait for queue to empty
  bool done;
  do
  {
    QueueLock.lock();
    done = LogQueue.empty();
    // entries queued after the last batch signal would otherwise never
    // be written, so keep waking the writer until the queue is empty
    if ( ! done )
      pthread_cond_signal( &myconvar );
    QueueLock.unlock();
    BSleep::sleep( 10 );
  } while ( ! done );

  pthread_cancel( thread );
  pthread_cond_destroy( &myconvar );
  pthread_attr_destroy( &attr );
  fclose( fd );
}

I’m using pthreads condition variables to communicate to the sleeping thread when new data is available.
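
The same single-writer idea can be sketched with the C++11 standard library (`std::thread`, `std::mutex`, `std::condition_variable`) in place of the raw pthreads calls. This is not the code that was timed for this post. `LogWriter` is a hypothetical class, and it writes to a string stream rather than a file so the sketch stands alone. The wait uses a predicate so that a wakeup sent before the thread starts waiting is not lost.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <sstream>
#include <string>
#include <thread>

// A single writer thread fed through a queue. LogWriter is a hypothetical
// class for illustration; output goes to a string stream, not a file.
class LogWriter {
public:
    LogWriter() : worker_(&LogWriter::run, this) {}
    ~LogWriter() { close(); }

    // Queue one line and wake the writer.
    void log(const std::string& line) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(line);
        }
        ready_.notify_one();
    }

    // Let the writer drain the queue, stop it, and return what was written.
    std::string close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        if (worker_.joinable()) worker_.join();
        return out_.str();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (true) {
            // The predicate guards against missed and spurious wakeups.
            ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            while (!queue_.empty()) {
                out_ << queue_.front();
                queue_.pop();
            }
            if (stopping_) return;
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    std::ostringstream out_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};
```

A caller creates one `LogWriter`, calls `log()` as often as it likes, and calls `close()` at the end to join the thread and collect the output. No claim is made here about how this version would compare in the timings below.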

How does this perform?

This design is much better than the previous one.

I changed the test to use 10,000,000 iterations this time because the difference between the two designs was much smaller. I needed that many iterations to magnify any differences enough to make them visible.

Manual data write:                0.920 seconds (920000 clocks)
Threaded write:                   14.640 seconds (14640000 clocks)

This is many orders of magnitude better than the previous design, but the threaded code is still 15.9 times slower (14.640 / 0.920)! This pretty much proves that none of the threaded designs was in any way superior to just writing the data directly to a file. The logging frameworks in all my programs aren’t going to use threading!

Have a great day

++djs