dlee on "Intermittent TimeoutException on writes. Works after node stop/start"

We are using the DataStax Enterprise trial. A system that had been working well started failing intermittently with timeout exceptions on writes. This has gone on for several days without resolving itself.

Stack Trace:

Exception:
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)

I read that a timeout can sometimes be an "in progress" exception, so I planned to trace my query and see whether it was slow. However, after I restarted a single node in the cluster, everything started working perfectly again, so the problem does not appear to be related to my query.
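
If a timed-out write may in fact still be "in progress", one common client-side response is to retry it with a backoff, provided the write is idempotent (re-applying it gives the same result). The following is only a minimal sketch of that idea. `WriteTimedOut` is a hypothetical stand-in for a client timeout such as Hector's `HTimedOutException`, and `retryOnTimeout` is an illustrative helper, not part of any driver API.

```java
import java.util.function.Supplier;

final class TimeoutRetry {

    /** Hypothetical stand-in for a client-side write timeout (assumption, not a real driver class). */
    public static class WriteTimedOut extends RuntimeException {
        public WriteTimedOut(String message) {
            super(message);
        }
    }

    /**
     * Runs the write, retrying only on WriteTimedOut, and doubling the pause
     * between attempts. Only safe for idempotent writes, because a timed-out
     * write may still have been applied on the server.
     */
    public static <T> T retryOnTimeout(Supplier<T> write, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        WriteTimedOut last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return write.get();
            } catch (WriteTimedOut e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoff);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt flag and give up with the timeout we saw.
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                    backoff *= 2;
                }
            }
        }
        // Every attempt timed out: surface the last timeout to the caller.
        throw last;
    }
}
```

A retry like this would not help if the only replica for a key stays unresponsive; it only covers transient slowness.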

If a single node was causing the timeouts, shouldn't the forwarding nodes stop sending requests to that box? The replication factor was set to 1, so maybe they had no alternative?
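
To make the "no alternative" reasoning concrete, here is a rough sketch of a simplified token ring in which each key is stored on the next `replicationFactor` nodes clockwise from its token. This is a toy model under stated assumptions, not Cassandra's actual placement code. The class `ReplicaRing`, its methods, the tokens, and the addresses are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

final class ReplicaRing {

    // Maps each node's token to its address; TreeMap keeps tokens in ring order.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addNode(int token, String address) {
        ring.put(token, address);
    }

    /** Returns the nodes that own a key: the next replicationFactor nodes clockwise. */
    public List<String> replicasFor(int keyToken, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        int wanted = Math.min(replicationFactor, ring.size());
        // First node whose token is >= the key's token, wrapping around the ring.
        Integer token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey();
        }
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey();
            }
        }
        return replicas;
    }

    /** Returns the replicas for a key that are not in the set of unresponsive nodes. */
    public List<String> liveReplicasFor(int keyToken, int replicationFactor, Set<String> downNodes) {
        List<String> live = new ArrayList<>(replicasFor(keyToken, replicationFactor));
        live.removeAll(downNodes);
        return live;
    }
}
```

In this model, with a replication factor of 1 every key has exactly one owner, so if that node is unresponsive `liveReplicasFor` returns an empty list and the forwarding node has nowhere else to send the write. With a replication factor of 2 or more, at least one other replica would remain.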

There are some messages in the node logs:

I get this a lot:
HintedHandOffManager.java (line 374) Timed out replaying hints to /10.33.175.144; aborting further deliveries

And this:
CustomTThreadPoolServer.java (line 210) Error occurred during processing of message.
java.lang.RuntimeException: Failed to open server transport: unknown
at com.datastax.bdp.transport.server.TNegotiatingServerTransport$Factory.getTransport(TNegotiatingServerTransport.java:288)
at com.datastax.bdp.transport.server.TNegotiatingServerTransport$Factory.getTransport(TNegotiatingServerTransport.java:260)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:184)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at com.datastax.bdp.transport.server.TPreviewableTransport.readUntilEof(TPreviewableTransport.java:79)
at com.datastax.bdp.transport.server.TPreviewableTransport.preview(TPreviewableTransport.java:55)
at com.datastax.bdp.transport.server.TNegotiatingServerTransport.open(TNegotiatingServerTransport.java:169)
at com.datastax.bdp.transport.server.TNegotiatingServerTransport$Factory.getTransport(TNegotiatingServerTransport.java:281)
... 5 more

Any ideas?
Thanks for any help