@mohammadsalem
Last active September 7, 2019 17:48
List of problems, solutions and tricks for DSpace
DSpace docker solr data volume for the first time
  • Build and create dspace container without the volume
  • Copy the solr data docker cp dspace:/dspace/solr solrData
  • Re-create dspace container with a volume ./solrData:/dspace/solr
DSpace docker solr data volume (when solr data already exists)
  • Build and create dspace container with the volume
  • Enter dspace container as root docker exec -it dspace bash
  • Change the owner of solr data chown -R dspace:dspace /dspace/solr
  • Exit and restart the container docker restart dspace
DSpace solr data merge
  • Copy the data to the solrData folder cp -r ../old_temp solrData/
  • Enter dspace container as root docker exec -it dspace bash
  • Change the owner of solr data chown -R dspace:dspace /dspace/solr
  • Exit and restart the container docker restart dspace
  • Enter dspace container as dspace user docker exec -it -u dspace dspace /bin/bash
  • Check that all the solr segments are OK and remove the corrupted ones. Warning: the -fix option permanently removes corrupted segments!
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/statistics/data/index -fix
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/oai/data/index -fix
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/search/data/index -fix
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/authority/data/index -fix
  • Copy the current solr data to a temp directory
$ mkdir /dspace/solr/new_temp && cp -r /dspace/solr/statistics /dspace/solr/oai /dspace/solr/search /dspace/solr/authority /dspace/solr/new_temp/
  • Remove the data from the original path and merge solr data
$ rm -rf /dspace/solr/statistics/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/statistics/data/index /dspace/solr/new_temp/statistics/data/index /dspace/solr/old_temp/statistics/data/index
$ rm -rf /dspace/solr/oai/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/oai/data/index /dspace/solr/new_temp/oai/data/index /dspace/solr/old_temp/oai/data/index
$ rm -rf /dspace/solr/search/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/search/data/index /dspace/solr/new_temp/search/data/index /dspace/solr/old_temp/search/data/index
$ rm -rf /dspace/solr/authority/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/authority/data/index /dspace/solr/new_temp/authority/data/index /dspace/solr/old_temp/authority/data/index
  • Run the following tasks to clean and index solr data
$ /dspace/bin/dspace oai import -o
$ /dspace/bin/dspace index-discovery
$ /dspace/bin/dspace index-discovery -o
$ /dspace/bin/dspace index-authority
$ /dspace/bin/dspace stats-util -i
$ /dspace/bin/dspace stats-util -o
$ /dspace/bin/dspace sub-daily
$ /dspace/bin/dspace filter-media
  • Exit and restart the container docker restart dspace
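The check and merge commands above are identical for every core except for the core name, so they can be generated in a loop. The following is a minimal sketch under that assumption, not a tested replacement for the steps above. The names LIB, SOLR, CORES, RUN, check_core, merge_core and for_each_core are illustrative and do not come from DSpace. By default it only prints the commands (dry run).

```shell
#!/bin/bash
# Sketch only: paths and jar versions are copied from the steps above.
# LIB, SOLR, CORES and RUN are illustrative names, not DSpace settings.
LIB=/usr/local/tomcat/webapps/solr/WEB-INF/lib
SOLR=/dspace/solr
CORES="statistics oai search authority"
# Dry run by default: every command is printed instead of executed.
# Set RUN to an empty string before running this to execute for real.
RUN=${RUN-echo}

check_core() {
    # Check the old index of one core; -fix drops corrupted segments.
    $RUN java -cp "$LIB/lucene-core-4.10.2.jar" -ea:org.apache.lucene... \
        org.apache.lucene.index.CheckIndex "$SOLR/old_temp/$1/data/index" -fix
}

merge_core() {
    # Empty the live index directory (like the rm -rf step above, but
    # hidden files are removed too), then merge both copies into it.
    $RUN find "$SOLR/$1/data/index" -mindepth 1 -delete &&
    $RUN java -cp "$LIB/lucene-core-4.10.2.jar:$LIB/lucene-misc-4.10.2.jar" \
        org.apache.lucene.misc.IndexMergeTool "$SOLR/$1/data/index" \
        "$SOLR/new_temp/$1/data/index" "$SOLR/old_temp/$1/data/index"
}

for_each_core() {
    # Call the given function once per core; stop at the first failure.
    local core
    for core in $CORES; do
        "$1" "$core" || return 1
    done
}
```

Inside the container, `for_each_core check_core` would cover the check step and `for_each_core merge_core` the merge step. The copy to new_temp still has to happen between the two, as described above.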
Script to import DSpace database IMPORT_DSPACE_DB.sh
#!/bin/bash

echo "$(tput setaf 6)ARE YOU SURE? <yes/no>"$(tput sgr 0)
read sure
# Quote the variable: an empty answer must abort, not break the test and continue
if [ "$sure" != 'yes' ]; then
   exit
fi

echo "$(tput setaf 6)AGAIN ARE YOU SURE? <yes/no>"$(tput sgr 0)
read sureagain
if [ "$sureagain" != 'yes' ]; then
   exit
fi

# Stop if the backup directory is missing instead of working in the wrong place
cd ./db_backup || exit 1

echo "$(tput setaf 6)Enter database number <<default:Last downloaded database>>"$(tput sgr 0)
read num

if [ ! -n "$num" ]; then
    num=`ls -t | awk '{printf("%s",$0);exit}' | tr -d '[:alpha:]\-\.'`
fi

if [ -f "dspace-$num.dump" ]; then
   echo "$(tput setaf 5)File dspace-$num.dump exists."$(tput sgr 0)
elif [ -f "dspace-$num.dump.tar.gz" ]; then
   echo "$(tput setaf 5)Extracting the file..."$(tput sgr 0)
   tar -xvzf "dspace-$num.dump.tar.gz"
else
   echo "$(tput setaf 5)File not found"$(tput sgr 0)
   exit
fi

echo "$(tput setaf 6)Stopping dspace container..."$(tput sgr 0)
docker stop dspace

echo "$(tput setaf 6)Dropping database..."$(tput sgr 0)
docker exec dspace_db dropdb -U postgres dspace

echo "$(tput setaf 6)Creating database..."$(tput sgr 0)
docker exec dspace_db createdb -U postgres -O dspace --encoding=UNICODE dspace

echo "$(tput setaf 6)Granting elevated privileges to dspace user..."$(tput sgr 0)
docker exec dspace_db psql -U postgres dspace -c 'alter user dspace createuser;'

echo "$(tput setaf 6)Copying database..."$(tput sgr 0)
docker cp dspace-$num.dump dspace_db:/

echo "$(tput setaf 6)Importing database..."$(tput sgr 0)
docker exec dspace_db pg_restore -U postgres -d dspace /dspace-$num.dump

echo "$(tput setaf 6)Revoking elevated privileges from dspace user..."$(tput sgr 0)
docker exec dspace_db psql -U postgres dspace -c 'alter user dspace nocreateuser;'

echo "$(tput setaf 6)Vacuuming database..."$(tput sgr 0)
docker exec dspace_db vacuumdb -U postgres dspace

echo "$(tput setaf 6)Updating sequences..."$(tput sgr 0)
docker cp dspace:/dspace/etc/postgres/update-sequences.sql .
docker cp update-sequences.sql dspace_db:/
docker exec dspace_db psql -U dspace -f /update-sequences.sql dspace

echo "$(tput setaf 6)Cleaning up..."$(tput sgr 0)
# Both files were copied to / inside the container, so remove them by absolute path
docker exec -it dspace_db bash -c "rm /dspace-$num.dump"
docker exec -it dspace_db bash -c "rm /update-sequences.sql"
rm dspace-$num.dump

echo "$(tput setaf 6)Starting dspace container..."$(tput sgr 0)
docker start dspace

echo "$(tput setaf 6)Finished."$(tput sgr 0)
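The default database number in the script above comes from stripping letters, dots and hyphens from the newest file name in db_backup. That breaks if any unrelated file is newer. A slightly stricter variant is sketched below, assuming the dumps are always named dspace-NUMBER.dump or dspace-NUMBER.dump.tar.gz, as the backup script produces them. The function names dump_number and latest_dump_number are made up for this example.

```shell
#!/bin/bash
# Sketch only: assumes file names of the form dspace-NUMBER.dump[.tar.gz].

dump_number() {
    # Print the NUMBER part of a dump file name.
    local name=${1##*/}     # drop any leading directory
    name=${name#dspace-}    # drop the "dspace-" prefix
    printf '%s\n' "${name%%.*}"   # drop everything from the first dot on
}

latest_dump_number() {
    # Print the number of the most recently modified dump in directory $1.
    # Returns non-zero if the directory holds no matching file.
    local newest
    newest=$(ls -t "$1"/dspace-*.dump* 2>/dev/null | head -n 1)
    [ -n "$newest" ] && dump_number "$newest"
}
```

In the import script this could replace the ls/awk/tr pipeline with something like `num=$(latest_dump_number .)`, followed by a check that `$num` is not empty.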
Script to create DSpace database backup IMPORT_DSPACE_DB.sh
#!/bin/bash
db_name="dspace-$(date +%s).dump"

echo "$(tput setaf 6)Creating full backup $db_name.tar.gz ..."$(tput sgr 0)

docker exec -i dspace_db pg_dump -U dspace -Fc -f /$db_name dspace
# Stop if the backup directory is missing
cd ./db_backup/ || exit 1
docker cp dspace_db:/$db_name .
tar -czvf $db_name.tar.gz $db_name

echo "$(tput setaf 6)Cleaning up..."$(tput sgr 0)
docker exec -it dspace_db bash -c "rm /$db_name"
rm $db_name
echo "$(tput setaf 6)Backup finished."$(tput sgr 0)
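The backup script deletes the uncompressed dump right after creating the archive, so it can be worth confirming that the archive is readable first. A minimal sketch of such a check follows. The function name verify_backup is hypothetical, and the check assumes the archive contains a single member named after the archive without its .tar.gz suffix, which is what the tar command above produces.

```shell
#!/bin/bash
# Sketch only: verify_backup is an illustrative helper, not part of the gist.

verify_backup() {
    # Succeeds only if the archive $1 can be listed and contains a member
    # whose name equals the archive name without the .tar.gz suffix.
    local archive=$1 dump
    dump=$(basename "$archive" .tar.gz)
    tar -tzf "$archive" 2>/dev/null | grep -qxF "$dump"
}
```

In the script this could guard the cleanup, for example `verify_backup "$db_name.tar.gz" && rm "$db_name"`.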
Hide Item Metadata Fields

Fields named here are hidden in the following places UNLESS the logged-in user is an Administrator:

  1. XMLUI metadata XML view, and Item splash pages (long and short views).
  2. JSPUI Item splash pages
  • Add a property in /dspace/config/dspace.cfg in the form: metadata.hide.SCHEMA.ELEMENT.QUALIFIER = true e.g. metadata.hide.mel.partner.id = true
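Adding the property by hand works, but a small helper avoids duplicate lines when several fields are hidden over time. This is a sketch only. The function name hide_field is made up, and it assumes the three-part SCHEMA.ELEMENT.QUALIFIER form shown above.

```shell
#!/bin/bash
# Sketch only: hide_field is an illustrative helper, not a DSpace command.

hide_field() {
    # Usage: hide_field CFG_FILE SCHEMA ELEMENT QUALIFIER
    local cfg=$1 line="metadata.hide.$2.$3.$4 = true"
    # -x matches whole lines, -F treats dots literally; append only if absent.
    grep -qxF "$line" "$cfg" || printf '%s\n' "$line" >> "$cfg"
}
```

For the example above the call would be `hide_field /dspace/config/dspace.cfg mel partner id`. Running it twice leaves a single line in the file.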
Solr view statistics
  • https://alanorth.github.io/cgspace-notes/2019-04/ search for "Holy shit".
  • Ensure that the file set in dbfile = /dspace/config/GeoLite2-City.mmdb in /dspace/config/modules/usage-statistics.cfg exists, otherwise hits won't be recorded.
  • Ensure that hits reach the DSpace container with the client IP address. In my case the nginx header name was misspelled: changing proxy_set_header XForwardedFor $proxy_add_x_forwarded_for; to proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; fixed it.
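Both conditions above can be checked from the shell before digging deeper into missing statistics. The sketch below assumes that dbfile holds a literal path (no variable substitution) and that the nginx directive sits on a single line. The function names geo_db_ok and xff_header_ok are invented for this example, and the nginx config path is whatever your setup uses.

```shell
#!/bin/bash
# Sketch only: illustrative preflight checks for the two points above.

geo_db_ok() {
    # Read the dbfile value from the config file $1 and test that the
    # file it points to exists. Assumes a literal path as the value.
    local db
    db=$(sed -n '/^[[:space:]]*dbfile[[:space:]]*=/{s/^[^=]*=[[:space:]]*//;s/[[:space:]]*$//;p;}' "$1" | head -n 1)
    [ -n "$db" ] && [ -f "$db" ]
}

xff_header_ok() {
    # True if the nginx config $1 sets the correctly spelled header.
    grep -Eq '^[[:space:]]*proxy_set_header[[:space:]]+X-Forwarded-For[[:space:]]' "$1"
}
```

For example, `geo_db_ok /dspace/config/modules/usage-statistics.cfg || echo "GeoLite database missing"` reports the first problem, and `xff_header_ok` run against the nginx site config reports the second.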
DSpace logs compress