Hi gurus,
After unexpected server shutdown HANA is not starting. Getting a little more deeper in the logs I´ve found the error listet bellow, it happens in nameserver startup process. Any hint?
hana001:/usr/sap/HDB/SYS/exe/hdb> ./hdbnameserver
service startup...
accepting requests at 127.0.0.1:30201; 127.0.0.2:30201
searching for master nameserver hana001:30201 ...
assign as master nameserver. assign to volume 1 started
checking for recovery request ...
loading topology ...
opening persistence ...
assign failed with persistence startup error. exception 1: no.3020046 (DataAccess/PageAccess/impl/PageImpl.cpp:399)
Wrong savepoint version: Expected 8373 but found 8361.
exception throw location:
1: 0x00007fb55f11bbbc in PageAccess::Page::verifyHeader(PageAccess::SizeClass, DataAccess::SavepointVersion const&, bool) const+0x3d8 at PageImpl.cpp:399 (libhdbdataaccess.so)
2: 0x00007fb55ee80da7 in DataAccess::SavepointImpl::loadRestartPage(PageAccess::PageIO&, unsigned long, bool, bool)+0x693 at Page.hpp:277 (libhdbdataaccess.so)
3: 0x00007fb55ee0205f in DataAccess::PersistenceManagerImpl::prepareOpen(unsigned long, bool, bool)+0x4b at PersistenceManagerImpl.cpp:5385 (libhdbdataaccess.so)
4: 0x00007fb55ee0223c in DataAccess::PersistenceManager::open(ltt::refcounted_handle<DataAccess::PersistenceConfiguration> const&, bool)+0x88 at PersistenceManagerImpl.cpp:2461 (libhdbdataaccess.so)
5: 0x00007fb56c510771 in PersistenceLayer::PersistenceSystem::initialize(NameServer::ServiceStartInfo const&, bool, PersistenceLayer::PERSISTENCE_MODE)+0x4a0 at PersistenceSystem.cpp:392 (libhdbpersistence.so)
6: 0x00007fb56c5437bc in PersistenceLayer::PersistenceFactory::initPersistence(PersistenceLayer::PERSISTENCE_MODE, ltt::releasable_handle<DataAccess::LoggerFactory>&, DataAccess::TransactionCallback*, NameServer::ServiceStartInfo&, ltt::refcounted_handle<TransactionManager::TransactionControlBlockFactory>&, bool)+0x158 at PersistenceFactory.cpp:394 (libhdbpersistence.so)
7: 0x00007fb56c43375a in PersistenceController::startup(PersistenceLayer::PERSISTENCE_MODE, NameServer::ServiceStartInfo*, bool, DataAccess::TablePreloadWriteCallback*, DataAccess::TablePreloadReadCallback*, Backup::RecoverCbc_Federation*)+0x1256 at PersistenceController.cpp:559 (libhdblogger.so)
8: 0x00007fb5751eedd5 in NameServer::Topology::initPersistence(NameServer::ServiceStartInfo&, bool, bool, TREX_ERROR::TRexError*, bool, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, NameServer::ServiceStartInfo::RequestAction)+0x421 at Topology.cpp:285 (libhdbns.so)
9: 0x00007fb5752fb0c4 in NameServer::TREXNameServer::loadTopology(NameServer::LoadTopologyMode, NameServer::ServiceStartInfo&, Backup::Backup_ExtendedRecoveryInformation*)+0x5b0 at TREXNameServer.cpp:18396 (libhdbns.so)
10: 0x00007fb575305c23 in NameServer::TREXNameServer::assign(NameServer::ServiceStartInfo&)+0x2790 at TREXNameServer.cpp:2123 (libhdbns.so)
11: 0x00007fb5762d4f17 in TRexAPI::TREXIndexServer::assign(NameServer::ServiceStartInfo&, bool, TREX_ERROR::TRexError&)+0x93 at TREXIndexServer.cpp:992 (hdbnameserver)
12: 0x00007fb57630c129 in TRexAPI::AssignThread::run(void*)+0x35 at TREXIndexServer.cpp:543 (hdbnameserver)
13: 0x00007fb56cd03db5 in TrexThreads::PoolThread::run()+0x831 at PoolThread.cpp:389 (libhdbbasement.so)
14: 0x00007fb56cd056f0 in TrexThreads::PoolThread::run(void*&)+0x10 at PoolThread.cpp:165 (libhdbbasement.so)
15: 0x00007fb55471a9f0 in Execution::Thread::staticMainImp(void**)+0x700 at Thread.cpp:461 (libhdbbasis.so)
16: 0x00007fb55471bfc8 in Execution::Thread::staticMain(void*)+0x34 at ThreadMain.cpp:26 (libhdbbasis.so)
stopping service...
persistence initialization failed -> stopping instance ...
cannot send signal. (2, No such file or directory)
prepare for shutting service down...
stop ClockMonitor thread...
stop MasterTokenLockWriter /usr/sap/HDB/SYS/global//hdb/nameserver.lck thread...
setInactive(nameserver@hana001:30201)
hana001:/usr/sap/HDB/SYS/exe/hdb>
Also bellow HDB start log:
hana001:/usr/sap/HDB/HDB02> ./HDB start
StartService
Impromptu CCC initialization by 'rscpCInit'.
See SAP note 1266393.
OK
OK
Starting instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 02 -function StartWait 2700 2
08.06.2016 03:44:24
Start
OK
08.06.2016 03:44:46
StartWait
FAIL: process hdbdaemon HDB Daemon not running
Final part from Daemon trace:
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545057 i Daemon | TrexDaemon.cpp(03513) : stdout = /usr/sap/HDB/HDB02/hana001/trace/xsuaaserver.out |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545062 i Daemon | TrexDaemon.cpp(03513) : stderr = |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545066 i Daemon | TrexDaemon.cpp(03513) : maxstdfiles = 2 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545071 i Daemon | TrexDaemon.cpp(03513) : runlevel = 4 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545075 i Daemon | TrexDaemon.cpp(03513) : flags = |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545080 i Daemon | TrexDaemon.cpp(03513) : window = |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545084 i Daemon | TrexDaemon.cpp(03513) : isolation = 0 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545088 i Daemon | TrexDaemon.cpp(03513) : userId = 4294967295 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545093 i Daemon | TrexDaemon.cpp(03513) : groupId = 4294967295 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545104 i Daemon | Daemon.cpp(00666) : runlevel 0 completely started |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545115 i Daemon | Program.cpp(00173) : line up of program group mdcdispatcher to instances <none>. |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545126 i Daemon | Program.cpp(00173) : line up of program group nameserver to instances 0. |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545460 i Daemon | TrexDaemon.cpp(02078) : starting program at '/usr/sap/HDB/HDB02/exe/hdbnameserver' with args '' |
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.569590 i Daemon | RunningInstance.cpp(00191) : start 'hdbnameserver' as process 17518 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.313937 i Daemon | SignalsUNIX.cpp(00647) : signo 3 SIGQUIT from user errno 0 code 0 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314034 i Daemon | SignalsUNIX.cpp(00647) : sender pid 17518 real user id 1000 executable '/hana/shared/HDB/exe/linuxx86_64/HDB_1.00.112.02.1459851171_2840326/hdbnameserver' |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314167 i Daemon | TrexDaemon.cpp(03558) : got shutdown event (stop) |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314228 i Daemon | Daemon.cpp(00752) : comment file contains: nameserver: persistence initialization failed |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314836 i Daemon | RunningInstance.cpp(00214) : stop process hdbnameserver with pid 17518 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.315143 i Daemon | TrexDaemon.cpp(02598) : stopped child with pid 17518 (17518) |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.401872 i Daemon | TrexDaemon.cpp(03778) : process hdbnameserver with pid 17518 exited normally with status 1 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.401939 i Daemon | TrexDaemon.cpp(03850) : all instances in runlevel 1 stopped |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.401952 i Daemon | Program.cpp(00173) : line up of program group dummyserver to instances <none>. |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.402492 i Daemon | ProgramStarter.cpp(00229) : writing started programs file /usr/sap/HDB/HDB02/hana001/lock/started_programs.txt: |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.528876 i Daemon | TrexDaemon.cpp(01616) : cleaning all |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.528931 i Daemon | TrexDaemon.cpp(02681) : found 8 segments in use with maximum id 8 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529017 i Daemon | TrexDaemon.cpp(02705) : segment 4, key 0x9f08cf81, id 3506180, sequence 107, owner 1000/1001, size 117, atime 2016-06-08 03:44:32, dtime 2016-06-08 03:44:35, ctime 2016-06-08 03:18:33, created by pid 17077, last changed by pid 17077, nattach 0 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529056 i Daemon | TrexDaemon.cpp(02705) : segment 5, key 0x00000000, id 3538949, sequence 108, owner 1000/1001, size 1060911, atime 2016-06-08 03:44:35, dtime 2016-06-08 03:44:35, ctime 2016-06-08 03:18:33, created by pid 17077, last changed by pid 17077, nattach 1 (deleted) |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529069 i Daemon | TrexDaemon.cpp(02705) : segment 6, key 0x00000000, id 3571718, sequence 109, owner 1000/113, size 1024, atime 2016-06-08 03:44:26, dtime 2016-06-08 03:44:26, ctime 2016-06-08 03:44:25, created by pid 17494, last changed by pid 17494, nattach 1 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529157 i Daemon | Main.cpp(01530) : exit hdbdaemon |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529171 i Daemon | Main.cpp(01541) : Success: /sapmnt/ld7272/a/HDB/jenkins_prod/workspace/FA_CO_LIN64GCC48HAPPY_rel_fa~newdb100_rel/sys/src/TrexDaemon/Main.cpp:1541 in "int main(int, char**)". Leaving the "main" |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.530121 i Network | NetworkListener.cpp(00805) : closing listen socket 7 bound to 127.0.0.1 |
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.530283 i Network | NetworkListener.cpp(00805) : closing listen socket 8 bound to 127.0.0.2 |