diff options
31 files changed, 1867 insertions, 274 deletions
@@ -24,11 +24,12 @@ 1. Rudin-Osher-Fatemi (ROF) Total Variation (explicit PDE minimisation scheme) **2D/3D CPU/GPU** (Ref. *1*) 2. Fast-Gradient-Projection (FGP) Total Variation **2D/3D CPU/GPU** (Ref. *2*) 3. Split-Bregman (SB) Total Variation **2D/3D CPU/GPU** (Ref. *5*) -4. Total Generalised Variation (TGV) model for higher-order regularisation **2D/3D CPU/GPU** (Ref. *6*) -5. Linear and nonlinear diffusion (explicit PDE minimisation scheme) **2D/3D CPU/GPU** (Ref. *8*) -6. Anisotropic Fourth-Order Diffusion (explicit PDE minimisation) **2D/3D CPU/GPU** (Ref. *9*) -7. A joint ROF-LLT (Lysaker-Lundervold-Tai) model for higher-order regularisation **2D/3D CPU/GPU** (Ref. *10,11*) -8. Nonlocal Total Variation regularisation (GS fixed point iteration) **2D CPU/GPU** (Ref. *12*) +4. Primal-Dual (PD) Total Variation **2D/3D CPU/GPU** (Ref. *13*) +5. Total Generalised Variation (TGV) model for higher-order regularisation **2D/3D CPU/GPU** (Ref. *6,13*) +6. Linear and nonlinear diffusion (explicit PDE minimisation scheme) **2D/3D CPU/GPU** (Ref. *8*) +7. Anisotropic Fourth-Order Diffusion (explicit PDE minimisation) **2D/3D CPU/GPU** (Ref. *9*) +8. A joint ROF-LLT (Lysaker-Lundervold-Tai) model for higher-order regularisation **2D/3D CPU/GPU** (Ref. *10,11*) +9. Nonlocal Total Variation regularisation (GS fixed point iteration) **2D CPU/GPU** (Ref. *12*) ### Multi-channel (vectorial): 1. Fast-Gradient-Projection (FGP) Directional Total Variation **2D/3D CPU/GPU** (Ref. *3,4,2*) @@ -145,29 +146,32 @@ addpath(/path/to/library); ``` ### References to implemented methods: -1. [Rudin, L.I., Osher, S. and Fatemi, E., 1992. Nonlinear total variation based noise removal algorithms. Physica D: nonlinear phenomena, 60(1-4), pp.259-268.](https://www.sciencedirect.com/science/article/pii/016727899290242F) +1. [Rudin, L.I., Osher, S. and Fatemi, E., 1992. Nonlinear total variation based noise removal algorithms. Physica D: nonlinear phenomena, 60(1-4)](https://www.sciencedirect.com/science/article/pii/016727899290242F) -2. [Beck, A. and Teboulle, M., 2009. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11), pp.2419-2434.](https://doi.org/10.1109/TIP.2009.2028250) +2. [Beck, A. and Teboulle, M., 2009. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11)](https://doi.org/10.1109/TIP.2009.2028250) -3. [Ehrhardt, M.J. and Betcke, M.M., 2016. Multicontrast MRI reconstruction with structure-guided total variation. SIAM Journal on Imaging Sciences, 9(3), pp.1084-1106.](https://doi.org/10.1137/15M1047325) +3. [Ehrhardt, M.J. and Betcke, M.M., 2016. Multicontrast MRI reconstruction with structure-guided total variation. SIAM Journal on Imaging Sciences, 9(3)](https://doi.org/10.1137/15M1047325) 4. [Kazantsev, D., Jørgensen, J.S., Andersen, M., Lionheart, W.R., Lee, P.D. and Withers, P.J., 2018. Joint image reconstruction method with correlative multi-channel prior for X-ray spectral computed tomography. Inverse Problems, 34(6)](https://doi.org/10.1088/1361-6420/aaba86) **Results can be reproduced using the following** [SOFTWARE](https://github.com/dkazanc/multi-channel-X-ray-CT) -5. [Goldstein, T. and Osher, S., 2009. The split Bregman method for L1-regularized problems. SIAM journal on imaging sciences, 2(2), pp.323-343.](https://doi.org/10.1137/080725891) +5. [Goldstein, T. and Osher, S., 2009. The split Bregman method for L1-regularized problems. SIAM journal on imaging sciences, 2(2)](https://doi.org/10.1137/080725891) -6. [Bredies, K., Kunisch, K. and Pock, T., 2010. Total generalized variation. SIAM Journal on Imaging Sciences, 3(3), pp.492-526.](https://doi.org/10.1137/090769521) +6. [Bredies, K., Kunisch, K. and Pock, T., 2010. Total generalized variation. SIAM Journal on Imaging Sciences, 3(3)](https://doi.org/10.1137/090769521) -7. [Duran, J., Moeller, M., Sbert, C. and Cremers, D., 2016. Collaborative total variation: a general framework for vectorial TV models. SIAM Journal on Imaging Sciences, 9(1), pp.116-151.](https://doi.org/10.1137/15M102873X) +7. [Duran, J., Moeller, M., Sbert, C. and Cremers, D., 2016. Collaborative total variation: a general framework for vectorial TV models. SIAM Journal on Imaging Sciences, 9(1)](https://doi.org/10.1137/15M102873X) -8. [Black, M.J., Sapiro, G., Marimont, D.H. and Heeger, D., 1998. Robust anisotropic diffusion. IEEE Transactions on image processing, 7(3), pp.421-432.](https://doi.org/10.1109/83.661192) +8. [Black, M.J., Sapiro, G., Marimont, D.H. and Heeger, D., 1998. Robust anisotropic diffusion. IEEE Transactions on image processing, 7(3)](https://doi.org/10.1109/83.661192) -9. [Hajiaboli, M.R., 2011. An anisotropic fourth-order diffusion filter for image noise removal. International Journal of Computer Vision, 92(2), pp.177-191.](https://doi.org/10.1007/s11263-010-0330-1) +9. [Hajiaboli, M.R., 2011. An anisotropic fourth-order diffusion filter for image noise removal. International Journal of Computer Vision, 92(2)](https://doi.org/10.1007/s11263-010-0330-1) -10. [Lysaker, M., Lundervold, A. and Tai, X.C., 2003. Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time. IEEE Transactions on image processing, 12(12), pp.1579-1590.](https://doi.org/10.1109/TIP.2003.819229) +10. [Lysaker, M., Lundervold, A. and Tai, X.C., 2003. Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time. IEEE Transactions on image processing, 12(12)](https://doi.org/10.1109/TIP.2003.819229) -11. [Kazantsev, D., Guo, E., Phillion, A.B., Withers, P.J. and Lee, P.D., 2017. Model-based iterative reconstruction using higher-order regularization of dynamic synchrotron data. Measurement Science and Technology, 28(9), p.094004.](https://doi.org/10.1088/1361-6501/aa7fa8) +11. [Kazantsev, D., Guo, E., Phillion, A.B., Withers, P.J. and Lee, P.D., 2017. Model-based iterative reconstruction using higher-order regularization of dynamic synchrotron data. Measurement Science and Technology, 28(9)](https://doi.org/10.1088/1361-6501/aa7fa8) + +12. [Abderrahim E., Lezoray O. and Bougleux S. 2008. Nonlocal discrete regularization on weighted graphs: a framework for image and manifold processing. IEEE Trans. Image Processing 17(7), pp. 1047-1060.](https://ieeexplore.ieee.org/document/4526700) + +13. [Chambolle, A. and Pock, T., 2010. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of mathematical imaging and vision 40(1)](https://doi.org/10.1007/s10851-010-0251-1) -12. [Abderrahim E., Lezoray O. and Bougleux S. 2008. Nonlocal discrete regularization on weighted graphs: a framework for image and manifold processing." IEEE Trans. Image Processing 17(7), pp. 1047-1060.](https://ieeexplore.ieee.org/document/4526700) ### References to software (please cite if used): * [Kazantsev, D., Pasca, E., Turner, M.J. and Withers, P.J., 2019. CCPi-Regularisation toolkit for computed tomographic image reconstruction with proximal splitting algorithms. SoftwareX, 9, pp.317-323.](https://www.sciencedirect.com/science/article/pii/S2352711018301912) diff --git a/build/run.sh b/build/run.sh index 285476b..f0b3d3e 100755 --- a/build/run.sh +++ b/build/run.sh @@ -6,7 +6,7 @@ rm -r ../build_proj mkdir ../build_proj cd ../build_proj/ #make clean -export CIL_VERSION=19.09 +export CIL_VERSION=19.10 # install Python modules without CUDA #cmake ../ -DBUILD_PYTHON_WRAPPER=ON -DBUILD_MATLAB_WRAPPER=OFF -DBUILD_CUDA=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=./install # install Python modules with CUDA diff --git a/demos/Matlab_demos/demoMatlab_3Ddenoise.m b/demos/Matlab_demos/demoMatlab_3Ddenoise.m index f018327..b7f92cb 100644 --- a/demos/Matlab_demos/demoMatlab_3Ddenoise.m +++ b/demos/Matlab_demos/demoMatlab_3Ddenoise.m @@ -62,6 +62,25 @@ fprintf('Denoise a volume using the FGP-TV model (GPU) \n'); % fprintf('%s %f \n', 'RMSE error for FGP-TV is:', rmse_fgpG); % figure; imshow(u_fgpG(:,:,7), [0 1]); title('FGP-TV denoised volume (GPU)'); %% +fprintf('Denoise a volume using the PD-TV model (CPU) \n'); +lambda_reg = 0.03; % regularsation parameter for all methods +iter_pd = 300; % number of FGP iterations +epsil_tol = 0.0; % tolerance +tic; [u_pd,infovec] = PD_TV(single(vol3D), lambda_reg, iter_pd, epsil_tol); toc; +energyfunc_val_fgp = TV_energy(single(u_pd),single(vol3D),lambda_reg, 1); % get energy function value +rmse_pd = (RMSE(Ideal3D(:),u_pd(:))); +fprintf('%s %f \n', 'RMSE error for PD-TV is:', rmse_pd); +figure; imshow(u_pd(:,:,7), [0 1]); title('PD-TV denoised volume (CPU)'); +%% +% fprintf('Denoise a volume using the PD-TV model (GPU) \n'); +% lambda_reg = 0.03; % regularsation parameter for all methods +% iter_pd = 300; % number of FGP iterations +% epsil_tol = 0.0; % tolerance +% tic; u_pdG = PD_TV_GPU(single(vol3D), lambda_reg, iter_pd, epsil_tol); toc; +% rmse_pdG = (RMSE(Ideal3D(:),u_pdG(:))); +% fprintf('%s %f \n', 'RMSE error for PD-TV is:', rmse_pdG); +% figure; imshow(u_pdG(:,:,7), [0 1]); title('PD-TV denoised volume (GPU)'); +%% fprintf('Denoise a volume using the SB-TV model (CPU) \n'); iter_sb = 150; % number of SB iterations epsil_tol = 0.0; % tolerance diff --git a/demos/Matlab_demos/demoMatlab_denoise.m b/demos/Matlab_demos/demoMatlab_denoise.m index b50eaf5..3d93cb6 100644 --- a/demos/Matlab_demos/demoMatlab_denoise.m +++ b/demos/Matlab_demos/demoMatlab_denoise.m @@ -46,6 +46,22 @@ figure; imshow(u_fgp, [0 1]); title('FGP-TV denoised image (CPU)'); % tic; u_fgpG = FGP_TV_GPU(single(u0), lambda_reg, iter_fgp, epsil_tol); toc; % figure; imshow(u_fgpG, [0 1]); title('FGP-TV denoised image (GPU)'); %% +fprintf('Denoise using the PD-TV model (CPU) \n'); +lambda_reg = 0.03; +iter_pd = 500; % number of FGP iterations +epsil_tol = 0.0; % tolerance +tic; [u_pd,infovec] = PD_TV(single(u0), lambda_reg, iter_pd, epsil_tol); toc; +energyfunc_val_pd = TV_energy(single(u_pd),single(u0),lambda_reg, 1); % get energy function value +rmsePD = (RMSE(u_pd(:),Im(:))); +fprintf('%s %f \n', 'RMSE error for PD-TV is:', rmsePD); +[ssimval] = ssim(u_pd*255,single(Im)*255); +fprintf('%s %f \n', 'MSSIM error for PD-TV is:', ssimval); +figure; imshow(u_pd, [0 1]); title('PD-TV denoised image (CPU)'); +%% +% fprintf('Denoise using the PD-TV model (GPU) \n'); +% tic; u_pdG = PD_TV_GPU(single(u0), lambda_reg, iter_pd, epsil_tol); toc; +% figure; imshow(u_pdG, [0 1]); title('PD-TV denoised image (GPU)'); +%% fprintf('Denoise using the SB-TV model (CPU) \n'); lambda_reg = 0.03; iter_sb = 200; % number of SB iterations diff --git a/demos/demo_cpu_regularisers.py b/demos/demo_cpu_regularisers.py index 8655623..7d66b7f 100644 --- a/demos/demo_cpu_regularisers.py +++ b/demos/demo_cpu_regularisers.py @@ -12,7 +12,7 @@ import matplotlib.pyplot as plt import numpy as np import os import timeit -from ccpi.filters.regularisers import ROF_TV, FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, TNV, NDF, Diff4th +from ccpi.filters.regularisers import ROF_TV, FGP_TV, PD_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, TNV, NDF, Diff4th from ccpi.filters.regularisers import PatchSelect, NLTV from ccpi.supp.qualitymetrics import QualityTools ############################################################################### @@ -129,8 +129,8 @@ imgplot = plt.imshow(u0,cmap="gray") pars = {'algorithm' : FGP_TV, \ 'input' : u0,\ 'regularisation_parameter':0.02, \ - 'number_of_iterations' :400 ,\ - 'tolerance_constant':1e-06,\ + 'number_of_iterations' :1500 ,\ + 'tolerance_constant':1e-08,\ 'methodTV': 0 ,\ 'nonneg': 0} @@ -161,6 +161,55 @@ plt.title('{}'.format('CPU results')) #%% print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") +print ("_______________PD-TV (2D)__________________") +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") + +## plot +fig = plt.figure() +plt.suptitle('Performance of PD-TV regulariser using the CPU') +a=fig.add_subplot(1,2,1) +a.set_title('Noisy Image') +imgplot = plt.imshow(u0,cmap="gray") + +# set parameters +pars = {'algorithm' : PD_TV, \ + 'input' : u0,\ + 'regularisation_parameter':0.02, \ + 'number_of_iterations' :1500 ,\ + 'tolerance_constant':1e-08,\ + 'methodTV': 0 ,\ + 'nonneg': 1, + 'lipschitz_const' : 8, + 'tau' : 0.0025} + +print ("#############PD TV CPU####################") +start_time = timeit.default_timer() +(pd_cpu,info_vec_cpu) = PD_TV(pars['input'], + pars['regularisation_parameter'], + pars['number_of_iterations'], + pars['tolerance_constant'], + pars['methodTV'], + pars['nonneg'], + pars['lipschitz_const'], + pars['tau'],'cpu') + +Qtools = QualityTools(Im, pd_cpu) +pars['rmse'] = Qtools.rmse() + +txtstr = printParametersToString(pars) +txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time) +print (txtstr) +a=fig.add_subplot(1,2,2) + +# these are matplotlib.patch.Patch properties +props = dict(boxstyle='round', facecolor='wheat', alpha=0.75) +# place a text box in upper left in axes coords +a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, + verticalalignment='top', bbox=props) +imgplot = plt.imshow(pd_cpu, cmap="gray") +plt.title('{}'.format('CPU results')) +#%% +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") print ("_______________SB-TV (2D)__________________") print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") diff --git a/demos/demo_cpu_regularisers3D.py b/demos/demo_cpu_regularisers3D.py index fc1e8e6..cfdd2d4 100644 --- a/demos/demo_cpu_regularisers3D.py +++ b/demos/demo_cpu_regularisers3D.py @@ -12,7 +12,7 @@ import matplotlib.pyplot as plt import numpy as np import os import timeit -from ccpi.filters.regularisers import ROF_TV, FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th +from ccpi.filters.regularisers import ROF_TV, FGP_TV, PD_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th from ccpi.supp.qualitymetrics import QualityTools ############################################################################### def printParametersToString(pars): @@ -30,8 +30,7 @@ def printParametersToString(pars): return txt ############################################################################### -# filename = os.path.join( "data" ,"lena_gray_512.tif") -filename = "/home/algol/Documents/DEV/CCPi-Regularisation-Toolkit/test/lena_gray_512.tif" +filename = os.path.join( "data" ,"lena_gray_512.tif") # read image Im = plt.imread(filename) @@ -169,7 +168,55 @@ a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, verticalalignment='top', bbox=props) imgplot = plt.imshow(fgp_cpu3D[10,:,:], cmap="gray") plt.title('{}'.format('Recovered volume on the CPU using FGP-TV')) +#%% +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") +print ("_______________PD-TV (3D)__________________") +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") + +## plot +fig = plt.figure() +plt.suptitle('Performance of PD-TV regulariser using the CPU') +a=fig.add_subplot(1,2,1) +a.set_title('Noisy Image') +imgplot = plt.imshow(noisyVol[10,:,:],cmap="gray") +# set parameters +pars = {'algorithm' : PD_TV, \ + 'input' : noisyVol,\ + 'regularisation_parameter':0.02, \ + 'number_of_iterations' :1000 ,\ + 'tolerance_constant':1e-06,\ + 'methodTV': 0 ,\ + 'nonneg': 0, + 'lipschitz_const' : 8, + 'tau' : 0.0025} + +print ("#############FGP TV GPU####################") +start_time = timeit.default_timer() +(pd_cpu3D,info_vec_cpu) = PD_TV(pars['input'], + pars['regularisation_parameter'], + pars['number_of_iterations'], + pars['tolerance_constant'], + pars['methodTV'], + pars['nonneg'], + pars['lipschitz_const'], + pars['tau'],'cpu') + +Qtools = QualityTools(idealVol, pd_cpu3D) +pars['rmse'] = Qtools.rmse() + +txtstr = printParametersToString(pars) +txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time) +print (txtstr) +a=fig.add_subplot(1,2,2) + +# these are matplotlib.patch.Patch properties +props = dict(boxstyle='round', facecolor='wheat', alpha=0.75) +# place a text box in upper left in axes coords +a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, + verticalalignment='top', bbox=props) +imgplot = plt.imshow(pd_cpu3D[10,:,:], cmap="gray") +plt.title('{}'.format('Recovered volume on the CPU using PD-TV')) #%% print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") print ("_______________SB-TV (3D)_________________") diff --git a/demos/demo_cpu_vs_gpu_regularisers.py b/demos/demo_cpu_vs_gpu_regularisers.py index 21e3899..015dfc6 100644 --- a/demos/demo_cpu_vs_gpu_regularisers.py +++ b/demos/demo_cpu_vs_gpu_regularisers.py @@ -12,7 +12,7 @@ import matplotlib.pyplot as plt import numpy as np import os import timeit -from ccpi.filters.regularisers import ROF_TV, FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th +from ccpi.filters.regularisers import ROF_TV, FGP_TV, PD_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th from ccpi.filters.regularisers import PatchSelect from ccpi.supp.qualitymetrics import QualityTools ############################################################################### @@ -220,6 +220,108 @@ if (diff_im.sum() > 1): else: print ("Arrays match") + +#%% +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") +print ("____________PD-TV bench___________________") +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") + +## plot +fig = plt.figure() +plt.suptitle('Comparison of PD-TV regulariser using CPU and GPU implementations') +a=fig.add_subplot(1,4,1) +a.set_title('Noisy Image') +imgplot = plt.imshow(u0,cmap="gray") + +# set parameters +pars = {'algorithm' : PD_TV, \ + 'input' : u0,\ + 'regularisation_parameter':0.02, \ + 'number_of_iterations' :1500 ,\ + 'tolerance_constant':0.0,\ + 'methodTV': 0 ,\ + 'nonneg': 0, + 'lipschitz_const' : 8, + 'tau' : 0.0025} + +print ("#############PD TV CPU####################") +start_time = timeit.default_timer() +(pd_cpu,info_vec_cpu) = PD_TV(pars['input'], + pars['regularisation_parameter'], + pars['number_of_iterations'], + pars['tolerance_constant'], + pars['methodTV'], + pars['nonneg'], + pars['lipschitz_const'], + pars['tau'],'cpu') + +Qtools = QualityTools(Im, pd_cpu) +pars['rmse'] = Qtools.rmse() + +txtstr = printParametersToString(pars) +txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time) +print (txtstr) +a=fig.add_subplot(1,4,2) + +# these are matplotlib.patch.Patch properties +props = dict(boxstyle='round', facecolor='wheat', alpha=0.75) +# place a text box in upper left in axes coords +a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, + verticalalignment='top', bbox=props) +imgplot = plt.imshow(pd_cpu, cmap="gray") +plt.title('{}'.format('CPU results')) + +# set parameters +pars = {'algorithm' : PD_TV, \ + 'input' : u0,\ + 'regularisation_parameter':0.02, \ + 'number_of_iterations' :1500 ,\ + 'tolerance_constant':0.0,\ + 'methodTV': 0 ,\ + 'nonneg': 0, + 'lipschitz_const' : 8, + 'tau' : 0.0025} + +print ("#############PD TV CPU####################") +start_time = timeit.default_timer() +(pd_gpu,info_vec_gpu) = PD_TV(pars['input'], + pars['regularisation_parameter'], + pars['number_of_iterations'], + pars['tolerance_constant'], + pars['methodTV'], + pars['nonneg'], + pars['lipschitz_const'], + pars['tau'],'gpu') + +Qtools = QualityTools(Im, pd_gpu) +pars['rmse'] = Qtools.rmse() + +txtstr = printParametersToString(pars) +txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time) +print (txtstr) +a=fig.add_subplot(1,4,3) + +# these are matplotlib.patch.Patch properties +props = dict(boxstyle='round', facecolor='wheat', alpha=0.75) +# place a text box in upper left in axes coords +a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, + verticalalignment='top', bbox=props) +imgplot = plt.imshow(pd_gpu, cmap="gray") +plt.title('{}'.format('GPU results')) + + +print ("--------Compare the results--------") +tolerance = 1e-05 +diff_im = np.zeros(np.shape(pd_cpu)) +diff_im = abs(pd_cpu - pd_gpu) +diff_im[diff_im > tolerance] = 1 +a=fig.add_subplot(1,4,4) +imgplot = plt.imshow(diff_im, vmin=0, vmax=1, cmap="gray") +plt.title('{}'.format('Pixels larger threshold difference')) +if (diff_im.sum() > 1): + print ("Arrays do not match!") +else: + print ("Arrays match") #%% print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") print ("____________SB-TV bench___________________") diff --git a/demos/demo_gpu_regularisers.py b/demos/demo_gpu_regularisers.py index 3efcfce..5131847 100644 --- a/demos/demo_gpu_regularisers.py +++ b/demos/demo_gpu_regularisers.py @@ -12,7 +12,7 @@ import matplotlib.pyplot as plt import numpy as np import os import timeit -from ccpi.filters.regularisers import ROF_TV, FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th +from ccpi.filters.regularisers import ROF_TV, FGP_TV, PD_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th from ccpi.filters.regularisers import PatchSelect, NLTV from ccpi.supp.qualitymetrics import QualityTools ############################################################################### @@ -158,6 +158,55 @@ imgplot = plt.imshow(fgp_gpu, cmap="gray") plt.title('{}'.format('GPU results')) #%% print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") +print ("_______________PD-TV (2D)__________________") +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") + +## plot +fig = plt.figure() +plt.suptitle('Performance of PD-TV regulariser using the GPU') +a=fig.add_subplot(1,2,1) +a.set_title('Noisy Image') +imgplot = plt.imshow(u0,cmap="gray") + +# set parameters +pars = {'algorithm' : PD_TV, \ + 'input' : u0,\ + 'regularisation_parameter':0.02, \ + 'number_of_iterations' :1500 ,\ + 'tolerance_constant':1e-06,\ + 'methodTV': 0 ,\ + 'nonneg': 1, + 'lipschitz_const' : 8, + 'tau' : 0.0025} + +print ("#############PD TV CPU####################") +start_time = timeit.default_timer() +(pd_gpu,info_vec_gpu) = PD_TV(pars['input'], + pars['regularisation_parameter'], + pars['number_of_iterations'], + pars['tolerance_constant'], + pars['methodTV'], + pars['nonneg'], + pars['lipschitz_const'], + pars['tau'],'gpu') + +Qtools = QualityTools(Im, pd_gpu) +pars['rmse'] = Qtools.rmse() + +txtstr = printParametersToString(pars) +txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time) +print (txtstr) +a=fig.add_subplot(1,2,2) + +# these are matplotlib.patch.Patch properties +props = dict(boxstyle='round', facecolor='wheat', alpha=0.75) +# place a text box in upper left in axes coords +a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, + verticalalignment='top', bbox=props) +imgplot = plt.imshow(pd_gpu, cmap="gray") +plt.title('{}'.format('GPU results')) +#%% +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") print ("____________SB-TV regulariser______________") print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") diff --git a/demos/demo_gpu_regularisers3D.py b/demos/demo_gpu_regularisers3D.py index ccf9694..2c25d01 100644 --- a/demos/demo_gpu_regularisers3D.py +++ b/demos/demo_gpu_regularisers3D.py @@ -12,7 +12,7 @@ import matplotlib.pyplot as plt import numpy as np import os import timeit -from ccpi.filters.regularisers import ROF_TV, FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th +from ccpi.filters.regularisers import ROF_TV, FGP_TV, PD_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th from ccpi.supp.qualitymetrics import QualityTools ############################################################################### def printParametersToString(pars): @@ -172,7 +172,54 @@ a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, verticalalignment='top', bbox=props) imgplot = plt.imshow(fgp_gpu3D[10,:,:], cmap="gray") plt.title('{}'.format('Recovered volume on the GPU using FGP-TV')) +#%% +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") +print ("_______________PD-TV (3D)__________________") +print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") + +## plot +fig = plt.figure() +plt.suptitle('Performance of PD-TV regulariser using the GPU') +a=fig.add_subplot(1,2,1) +a.set_title('Noisy Image') +imgplot = plt.imshow(noisyVol[10,:,:],cmap="gray") +# set parameters +pars = {'algorithm' : PD_TV, \ + 'input' : noisyVol,\ + 'regularisation_parameter':0.02, \ + 'number_of_iterations' :1000 ,\ + 'tolerance_constant':1e-06,\ + 'methodTV': 0 ,\ + 'nonneg': 0, + 'lipschitz_const' : 8, + 'tau' : 0.0025} + +print ("#############PD TV GPU####################") +start_time = timeit.default_timer() +(pd_gpu3D, info_vec_gpu) = PD_TV(pars['input'], + pars['regularisation_parameter'], + pars['number_of_iterations'], + pars['tolerance_constant'], + pars['methodTV'], + pars['nonneg'], + pars['lipschitz_const'], + pars['tau'],'gpu') + +Qtools = QualityTools(idealVol, pd_gpu3D) +pars['rmse'] = Qtools.rmse() +txtstr = printParametersToString(pars) +txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time) +print (txtstr) +a=fig.add_subplot(1,2,2) + +# these are matplotlib.patch.Patch properties +props = dict(boxstyle='round', facecolor='wheat', alpha=0.75) +# place a text box in upper left in axes coords +a.text(0.15, 0.25, txtstr, transform=a.transAxes, fontsize=14, + verticalalignment='top', bbox=props) +imgplot = plt.imshow(pd_gpu3D[10,:,:], cmap="gray") +plt.title('{}'.format('Recovered volume on the GPU using PD-TV')) #%% print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%") print ("_______________SB-TV (3D)__________________") diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 5fe1a57..f93c19a 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -17,4 +17,12 @@ if (BUILD_MATLAB_WRAPPER) endif() if (BUILD_PYTHON_WRAPPER) add_subdirectory(Python) -endif()
\ No newline at end of file +endif() +find_package(OpenMP REQUIRED) +if (OPENMP_FOUND) + set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}") + set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}") + set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS} ${OpenMP_CXX_FLAGS}") + set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_SHARED_LINKER_FLAGS} ${OpenMP_CXX_FLAGS}") + set (CMAKE_STATIC_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_STATIC_LINKER_FLAGS} ${OpenMP_CXX_FLAGS}") +endif() diff --git a/src/Core/CMakeLists.txt b/src/Core/CMakeLists.txt index eea0d63..4e43b5e 100644 --- a/src/Core/CMakeLists.txt +++ b/src/Core/CMakeLists.txt @@ -12,20 +12,18 @@ set (CIL_VERSION $ENV{CIL_VERSION} CACHE INTERNAL "Core Imaging Library version" message("CIL_VERSION ${CIL_VERSION}") #include (GenerateExportHeader) +## Build the regularisers package as a library +message("Creating Regularisers as a shared library") -find_package(OpenMP) +find_package(OpenMP REQUIRED) if (OPENMP_FOUND) set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}") set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}") set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS} ${OpenMP_CXX_FLAGS}") set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_SHARED_LINKER_FLAGS} ${OpenMP_CXX_FLAGS}") set (CMAKE_STATIC_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_STATIC_LINKER_FLAGS} ${OpenMP_CXX_FLAGS}") - endif() -## Build the regularisers package as a library -message("Creating Regularisers as a shared library") - message("CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS}") message("CMAKE_C_FLAGS ${CMAKE_C_FLAGS}") message("CMAKE_EXE_LINKER_FLAGS ${CMAKE_EXE_LINKER_FLAGS}") @@ -39,21 +37,21 @@ if(WIN32) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAGS}") set (CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS} ${FLAGS}") set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /NODEFAULTLIB:MSVCRT.lib") - + set (EXTRA_LIBRARIES) - + message("library lib: ${LIBRARY_LIB}") - + elseif(UNIX) - set (FLAGS "-O2 -funsigned-char -Wall -Wl,--no-undefined -DCCPiReconstructionIterative_EXPORTS ") + set (FLAGS "-O2 -funsigned-char -Wall -Wl,--no-undefined -DCCPiReconstructionIterative_EXPORTS ") set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAGS}") set (CMAKE_C_FLAGS "${CMAKE_CXX_FLAGS} ${FLAGS}") - - set (EXTRA_LIBRARIES + + set (EXTRA_LIBRARIES "gomp" "m" ) - + endif() message("CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS}") @@ -66,6 +64,7 @@ message("Adding regularisers as a shared library") add_library(cilreg SHARED ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/FGP_TV_core.c ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/SB_TV_core.c + ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/PD_TV_core.c ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/TGV_core.c ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/Diffusion_core.c ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/Diffus4th_order_core.c @@ -80,8 +79,8 @@ add_library(cilreg SHARED ${CMAKE_CURRENT_SOURCE_DIR}/inpainters_CPU/NonlocalMarching_Inpaint_core.c ) target_link_libraries(cilreg ${EXTRA_LIBRARIES} ) -include_directories(cilreg PUBLIC - ${LIBRARY_INC}/include +include_directories(cilreg PUBLIC + ${LIBRARY_INC}/include ${CMAKE_CURRENT_SOURCE_DIR} ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_CPU/ ${CMAKE_CURRENT_SOURCE_DIR}/inpainters_CPU/ ) @@ -92,14 +91,14 @@ if (UNIX) message ("I'd install into ${CMAKE_INSTALL_PREFIX}/lib") install(TARGETS cilreg LIBRARY DESTINATION lib - CONFIGURATIONS ${CMAKE_BUILD_TYPE} + CONFIGURATIONS ${CMAKE_BUILD_TYPE} ) elseif(WIN32) message ("I'd install into ${CMAKE_INSTALL_PREFIX} lib bin") - install(TARGETS cilreg + install(TARGETS cilreg RUNTIME DESTINATION bin ARCHIVE DESTINATION lib - CONFIGURATIONS ${CMAKE_BUILD_TYPE} + CONFIGURATIONS ${CMAKE_BUILD_TYPE} ) endif() @@ -114,6 +113,7 @@ if (BUILD_CUDA) CUDA_ADD_LIBRARY(cilregcuda SHARED ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_GPU/TV_ROF_GPU_core.cu ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_GPU/TV_FGP_GPU_core.cu + ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_GPU/TV_PD_GPU_core.cu ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_GPU/TV_SB_GPU_core.cu ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_GPU/LLT_ROF_GPU_core.cu ${CMAKE_CURRENT_SOURCE_DIR}/regularisers_GPU/TGV_GPU_core.cu @@ -126,14 +126,14 @@ if (BUILD_CUDA) message ("I'd install into ${CMAKE_INSTALL_PREFIX}/lib") install(TARGETS cilregcuda LIBRARY DESTINATION lib - CONFIGURATIONS ${CMAKE_BUILD_TYPE} + CONFIGURATIONS ${CMAKE_BUILD_TYPE} ) elseif(WIN32) message ("I'd install into ${CMAKE_INSTALL_PREFIX} lib bin") install(TARGETS cilregcuda RUNTIME DESTINATION bin ARCHIVE DESTINATION lib - CONFIGURATIONS ${CMAKE_BUILD_TYPE} + CONFIGURATIONS ${CMAKE_BUILD_TYPE} ) endif() else() diff --git a/src/Core/regularisers_CPU/FGP_TV_core.c b/src/Core/regularisers_CPU/FGP_TV_core.c index a16a2e5..12f2254 100644 --- a/src/Core/regularisers_CPU/FGP_TV_core.c +++ b/src/Core/regularisers_CPU/FGP_TV_core.c @@ -46,12 +46,12 @@ float TV_FGP_CPU_main(float *Input, float *Output, float *infovector, float lamb float tk = 1.0f; float tkp1 =1.0f; int count = 0; - + if (dimZ <= 1) { /*2D case */ float *Output_prev=NULL, *P1=NULL, *P2=NULL, *P1_prev=NULL, *P2_prev=NULL, *R1=NULL, *R2=NULL; DimTotal = (long)(dimX*dimY); - + if (epsil != 0.0f) Output_prev = calloc(DimTotal, sizeof(float)); P1 = calloc(DimTotal, sizeof(float)); P2 = calloc(DimTotal, sizeof(float)); @@ -59,32 +59,32 @@ float TV_FGP_CPU_main(float *Input, float *Output, float *infovector, float lamb P2_prev = calloc(DimTotal, sizeof(float)); R1 = calloc(DimTotal, sizeof(float)); R2 = calloc(DimTotal, sizeof(float)); - + /* begin iterations */ for(ll=0; ll<iterationsNumb; ll++) { - + if ((epsil != 0.0f) && (ll % 5 == 0)) copyIm(Output, Output_prev, (long)(dimX), (long)(dimY), 1l); /* computing the gradient of the objective function */ Obj_func2D(Input, Output, R1, R2, lambdaPar, (long)(dimX), (long)(dimY)); - + /* apply nonnegativity */ if (nonneg == 1) for(j=0; j<DimTotal; j++) {if (Output[j] < 0.0f) Output[j] = 0.0f;} - + /*Taking a step towards minus of the gradient*/ Grad_func2D(P1, P2, Output, R1, R2, lambdaPar, (long)(dimX), (long)(dimY)); - + /* projection step */ Proj_func2D(P1, P2, methodTV, DimTotal); - + /*updating R and t*/ tkp1 = (1.0f + sqrtf(1.0f + 4.0f*tk*tk))*0.5f; Rupd_func2D(P1, P1_prev, P2, P2_prev, R1, R2, tkp1, tk, DimTotal); - + /*storing old values*/ copyIm(P1, P1_prev, (long)(dimX), (long)(dimY), 1l); copyIm(P2, P2_prev, (long)(dimX), (long)(dimY), 1l); tk = tkp1; - + /* check early stopping criteria */ if ((epsil != 0.0f) && (ll % 5 == 0)) { re = 0.0f; re1 = 0.0f; @@ -105,7 +105,7 @@ float TV_FGP_CPU_main(float *Input, float *Output, float *infovector, float lamb /*3D case*/ float *Output_prev=NULL, *P1=NULL, *P2=NULL, *P3=NULL, *P1_prev=NULL, *P2_prev=NULL, *P3_prev=NULL, *R1=NULL, *R2=NULL, *R3=NULL; DimTotal = (long)(dimX*dimY*dimZ); - + if (epsil != 0.0f) Output_prev = calloc(DimTotal, sizeof(float)); P1 = calloc(DimTotal, sizeof(float)); P2 = calloc(DimTotal, sizeof(float)); @@ -116,28 +116,28 @@ float TV_FGP_CPU_main(float *Input, float *Output, float *infovector, float lamb R1 = calloc(DimTotal, sizeof(float)); R2 = calloc(DimTotal, sizeof(float)); R3 = calloc(DimTotal, sizeof(float)); - + /* begin iterations */ for(ll=0; ll<iterationsNumb; ll++) { - + if ((epsil != 0.0f) && (ll % 5 == 0)) copyIm(Output, Output_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* computing the gradient of the objective function */ Obj_func3D(Input, Output, R1, R2, R3, lambdaPar, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* apply nonnegativity */ if (nonneg == 1) for(j=0; j<DimTotal; j++) {if (Output[j] < 0.0f) Output[j] = 0.0f;} - + /*Taking a step towards minus of the gradient*/ Grad_func3D(P1, P2, P3, Output, R1, R2, R3, lambdaPar, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* projection step */ Proj_func3D(P1, P2, P3, methodTV, DimTotal); - + /*updating R and t*/ tkp1 = (1.0f + sqrtf(1.0f + 4.0f*tk*tk))*0.5f; Rupd_func3D(P1, P1_prev, P2, P2_prev, P3, P3_prev, R1, R2, R3, tkp1, tk, DimTotal); - + /* calculate norm - stopping rules*/ if ((epsil != 0.0f) && (ll % 5 == 0)) { re = 0.0f; re1 = 0.0f; @@ -151,22 +151,22 @@ float TV_FGP_CPU_main(float *Input, float *Output, float *infovector, float lamb if (re < epsil) count++; if (count > 3) break; } - + /*storing old values*/ copyIm(P1, P1_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); copyIm(P2, P2_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); copyIm(P3, P3_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); tk = tkp1; } - + if (epsil != 0.0f) free(Output_prev); free(P1); free(P2); free(P3); free(P1_prev); free(P2_prev); free(P3_prev); free(R1); free(R2); free(R3); } - + /*adding info into info_vector */ infovector[0] = (float)(ll); /*iterations number (if stopped earlier based on tolerance)*/ infovector[1] = re; /* reached tolerance */ - + return 0; } @@ -202,36 +202,6 @@ float Grad_func2D(float *P1, float *P2, float *D, float *R1, float *R2, float la }} return 1; } -float Proj_func2D(float *P1, float *P2, int methTV, long DimTotal) -{ - float val1, val2, denom, sq_denom; - long i; - if (methTV == 0) { - /* isotropic TV*/ -#pragma omp parallel for shared(P1,P2) private(i,denom,sq_denom) - for(i=0; i<DimTotal; i++) { - denom = powf(P1[i],2) + powf(P2[i],2); - if (denom > 1.0f) { - sq_denom = 1.0f/sqrtf(denom); - P1[i] = P1[i]*sq_denom; - P2[i] = P2[i]*sq_denom; - } - } - } - else { - /* anisotropic TV*/ -#pragma omp parallel for shared(P1,P2) private(i,val1,val2) - for(i=0; i<DimTotal; i++) { - val1 = fabs(P1[i]); - val2 = fabs(P2[i]); - if (val1 < 1.0f) {val1 = 1.0f;} - if (val2 < 1.0f) {val2 = 1.0f;} - P1[i] = P1[i]/val1; - P2[i] = P2[i]/val2; - } - } - return 1; -} float Rupd_func2D(float *P1, float *P1_old, float *P2, float *P2_old, float *R1, float *R2, float tkp1, float tk, long DimTotal) { long i; @@ -284,40 +254,6 @@ float Grad_func3D(float *P1, float *P2, float *P3, float *D, float *R1, float *R }}} return 1; } -float Proj_func3D(float *P1, float *P2, float *P3, int methTV, long DimTotal) -{ - float val1, val2, val3, denom, sq_denom; - long i; - if (methTV == 0) { - /* isotropic TV*/ -#pragma omp parallel for shared(P1,P2,P3) private(i,val1,val2,val3,sq_denom) - for(i=0; i<DimTotal; i++) { - denom = powf(P1[i],2) + powf(P2[i],2) + powf(P3[i],2); - if (denom > 1.0f) { - sq_denom = 1.0f/sqrtf(denom); - P1[i] = P1[i]*sq_denom; - P2[i] = P2[i]*sq_denom; - P3[i] = P3[i]*sq_denom; - } - } - } - else { - /* anisotropic TV*/ -#pragma omp parallel for shared(P1,P2,P3) private(i,val1,val2,val3) - for(i=0; i<DimTotal; i++) { - val1 = fabs(P1[i]); - val2 = fabs(P2[i]); - val3 = fabs(P3[i]); - if (val1 < 1.0f) {val1 = 1.0f;} - if (val2 < 1.0f) {val2 = 1.0f;} - if (val3 < 1.0f) {val3 = 1.0f;} - P1[i] = P1[i]/val1; - P2[i] = P2[i]/val2; - P3[i] = P3[i]/val3; - } - } - return 1; -} float Rupd_func3D(float *P1, float *P1_old, float *P2, float *P2_old, float *P3, float *P3_old, float *R1, float *R2, float *R3, float tkp1, float tk, long DimTotal) { long i; diff --git a/src/Core/regularisers_CPU/FGP_TV_core.h b/src/Core/regularisers_CPU/FGP_TV_core.h index 04e6e80..e9c3cde 100644 --- a/src/Core/regularisers_CPU/FGP_TV_core.h +++ b/src/Core/regularisers_CPU/FGP_TV_core.h @@ -29,12 +29,12 @@ limitations under the License. /* C-OMP implementation of FGP-TV [1] denoising/regularization model (2D/3D case) * * Input Parameters: - * 1. Noisy image/volume - * 2. lambda - regularization parameter + * 1. Noisy image/volume + * 2. lambda - regularization parameter * 3. Number of iterations - * 4. eplsilon: tolerance constant + * 4. eplsilon: tolerance constant * 5. TV-type: methodTV - 'iso' (0) or 'l1' (1) - * 6. nonneg: 'nonnegativity (0 is OFF by default) + * 6. nonneg: 'nonnegativity (0 is OFF by default) * * Output: * [1] Filtered/regularized image/volume @@ -44,7 +44,7 @@ limitations under the License. * This function is based on the Matlab's code and paper by * [1] Amir Beck and Marc Teboulle, "Fast Gradient-Based Algorithms for Constrained Total Variation Image Denoising and Deblurring Problems" */ - + #ifdef __cplusplus extern "C" { #endif @@ -52,12 +52,10 @@ CCPI_EXPORT float TV_FGP_CPU_main(float *Input, float *Output, float *infovector CCPI_EXPORT float Obj_func2D(float *A, float *D, float *R1, float *R2, float lambda, long dimX, long dimY); CCPI_EXPORT float Grad_func2D(float *P1, float *P2, float *D, float *R1, float *R2, float lambda, long dimX, long dimY); -CCPI_EXPORT float Proj_func2D(float *P1, float *P2, int methTV, long DimTotal); CCPI_EXPORT float Rupd_func2D(float *P1, float *P1_old, float *P2, float *P2_old, float *R1, float *R2, float tkp1, float tk, long DimTotal); CCPI_EXPORT float Obj_func3D(float *A, float *D, float *R1, float *R2, float *R3, float lambda, long dimX, long dimY, long dimZ); CCPI_EXPORT float Grad_func3D(float *P1, float *P2, float *P3, float *D, float *R1, float *R2, float *R3, float lambda, long dimX, long dimY, long dimZ); -CCPI_EXPORT float Proj_func3D(float *P1, float *P2, float *P3, int methTV, long DimTotal); CCPI_EXPORT float Rupd_func3D(float *P1, float *P1_old, float *P2, float *P2_old, float *P3, float *P3_old, float *R1, float *R2, float *R3, float tkp1, float tk, long DimTotal); #ifdef __cplusplus } diff --git a/src/Core/regularisers_CPU/FGP_dTV_core.c b/src/Core/regularisers_CPU/FGP_dTV_core.c index afd7264..e828be6 100644 --- a/src/Core/regularisers_CPU/FGP_dTV_core.c +++ b/src/Core/regularisers_CPU/FGP_dTV_core.c @@ -52,11 +52,11 @@ float dTV_FGP_CPU_main(float *Input, float *InputRef, float *Output, float *info float tk = 1.0f; float tkp1=1.0f; int count = 0; - - + + float *Output_prev=NULL, *P1=NULL, *P2=NULL, *P1_prev=NULL, *P2_prev=NULL, *R1=NULL, *R2=NULL, *InputRef_x=NULL, *InputRef_y=NULL; DimTotal = (long)(dimX*dimY*dimZ); - + if (epsil != 0.0f) Output_prev = calloc(DimTotal, sizeof(float)); P1 = calloc(DimTotal, sizeof(float)); P2 = calloc(DimTotal, sizeof(float)); @@ -66,39 +66,39 @@ float dTV_FGP_CPU_main(float *Input, float *InputRef, float *Output, float *info R2 = calloc(DimTotal, sizeof(float)); InputRef_x = calloc(DimTotal, sizeof(float)); InputRef_y = calloc(DimTotal, sizeof(float)); - + if (dimZ <= 1) { /*2D case */ /* calculate gradient field (smoothed) for the reference image */ GradNorm_func2D(InputRef, InputRef_x, InputRef_y, eta, (long)(dimX), (long)(dimY)); - + /* begin iterations */ for(ll=0; ll<iterationsNumb; ll++) { - + if ((epsil != 0.0f) && (ll % 5 == 0)) copyIm(Output, Output_prev, (long)(dimX), (long)(dimY), 1l); /*projects a 2D vector field R-1,2 onto the orthogonal complement of another 2D vector field InputRef_xy*/ ProjectVect_func2D(R1, R2, InputRef_x, InputRef_y, (long)(dimX), (long)(dimY)); - + /* computing the gradient of the objective function */ Obj_dfunc2D(Input, Output, R1, R2, lambdaPar, (long)(dimX), (long)(dimY)); - + /* apply nonnegativity */ if (nonneg == 1) for(j=0; j<DimTotal; j++) {if (Output[j] < 0.0f) Output[j] = 0.0f;} - + /*Taking a step towards minus of the gradient*/ Grad_dfunc2D(P1, P2, Output, R1, R2, InputRef_x, InputRef_y, lambdaPar, (long)(dimX), (long)(dimY)); - + /* projection step */ - Proj_dfunc2D(P1, P2, methodTV, DimTotal); - + Proj_func2D(P1, P2, methodTV, DimTotal); + /*updating R and t*/ tkp1 = (1.0f + sqrt(1.0f + 4.0f*tk*tk))*0.5f; Rupd_dfunc2D(P1, P1_prev, P2, P2_prev, R1, R2, tkp1, tk, DimTotal); - + copyIm(P1, P1_prev, (long)(dimX), (long)(dimY), 1l); copyIm(P2, P2_prev, (long)(dimX), (long)(dimY), 1l); tk = tkp1; - + /* check early stopping criteria */ if ((epsil != 0.0f) && (ll % 5 == 0)) { re = 0.0f; re1 = 0.0f; @@ -116,45 +116,45 @@ float dTV_FGP_CPU_main(float *Input, float *InputRef, float *Output, float *info else { /*3D case*/ float *P3=NULL, *P3_prev=NULL, *R3=NULL, *InputRef_z=NULL; - + P3 = calloc(DimTotal, sizeof(float)); P3_prev = calloc(DimTotal, sizeof(float)); R3 = calloc(DimTotal, sizeof(float)); InputRef_z = calloc(DimTotal, sizeof(float)); - + /* calculate gradient field (smoothed) for the reference volume */ GradNorm_func3D(InputRef, InputRef_x, InputRef_y, InputRef_z, eta, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* begin iterations */ for(ll=0; ll<iterationsNumb; ll++) { - + if ((epsil != 0.0f) && (ll % 5 == 0)) copyIm(Output, Output_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /*projects a 3D vector field R-1,2,3 onto the orthogonal complement of another 3D vector field InputRef_xyz*/ ProjectVect_func3D(R1, R2, R3, InputRef_x, InputRef_y, InputRef_z, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* computing the gradient of the objective function */ Obj_dfunc3D(Input, Output, R1, R2, R3, lambdaPar, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* apply nonnegativity */ if (nonneg == 1) for(j=0; j<DimTotal; j++) {if (Output[j] < 0.0f) Output[j] = 0.0f;} - + /*Taking a step towards minus of the gradient*/ Grad_dfunc3D(P1, P2, P3, Output, R1, R2, R3, InputRef_x, InputRef_y, InputRef_z, lambdaPar, (long)(dimX), (long)(dimY), (long)(dimZ)); - + /* projection step */ - Proj_dfunc3D(P1, P2, P3, methodTV, DimTotal); - + Proj_func3D(P1, P2, P3, methodTV, DimTotal); + /*updating R and t*/ tkp1 = (1.0f + sqrt(1.0f + 4.0f*tk*tk))*0.5f; Rupd_dfunc3D(P1, P1_prev, P2, P2_prev, P3, P3_prev, R1, R2, R3, tkp1, tk, DimTotal); - + /*storing old values*/ copyIm(P1, P1_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); copyIm(P2, P2_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); copyIm(P3, P3_prev, (long)(dimX), (long)(dimY), (long)(dimZ)); tk = tkp1; - + /* check early stopping criteria */ if ((epsil != 0.0f) && (ll % 5 == 0)) { re = 0.0f; re1 = 0.0f; @@ -168,16 +168,16 @@ float dTV_FGP_CPU_main(float *Input, float *InputRef, float *Output, float *info if (count > 3) break; } } - + free(P3); free(P3_prev); free(R3); free(InputRef_z); } if (epsil != 0.0f) free(Output_prev); free(P1); free(P2); free(P1_prev); free(P2_prev); free(R1); free(R2); free(InputRef_x); free(InputRef_y); - + /*adding info into info_vector */ infovector[0] = (float)(ll); /*iterations number (if stopped earlier based on tolerance)*/ infovector[1] = re; /* reached tolerance */ - + return 0; } @@ -185,7 +185,6 @@ float dTV_FGP_CPU_main(float *Input, float *InputRef, float *Output, float *info /********************************************************************/ /***************************2D Functions*****************************/ /********************************************************************/ - float GradNorm_func2D(float *B, float *B_x, float *B_y, float eta, long dimX, long dimY) { long i,j,index; @@ -249,47 +248,17 @@ float Grad_dfunc2D(float *P1, float *P2, float *D, float *R1, float *R2, float * /* boundary conditions */ if (i == dimX-1) val1 = 0.0f; else val1 = D[index] - D[j*dimX + (i+1)]; if (j == dimY-1) val2 = 0.0f; else val2 = D[index] - D[(j+1)*dimX + i]; - + in_prod = val1*B_x[index] + val2*B_y[index]; /* calculate inner product */ val1 = val1 - in_prod*B_x[index]; val2 = val2 - in_prod*B_y[index]; - + P1[index] = R1[index] + multip*val1; P2[index] = R2[index] + multip*val2; - + }} return 1; } -float Proj_dfunc2D(float *P1, float *P2, int methTV, long DimTotal) -{ - float val1, val2, denom, sq_denom; - long i; - if (methTV == 0) { - /* isotropic TV*/ -#pragma omp parallel for shared(P1,P2) private(i,denom,sq_denom) - for(i=0; i<DimTotal; i++) { - denom = powf(P1[i],2) + powf(P2[i],2); - if (denom > 1.0f) { - sq_denom = 1.0f/sqrtf(denom); - P1[i] = P1[i]*sq_denom; - P2[i] = P2[i]*sq_denom; - } - } - } - else { - /* anisotropic TV*/ -#pragma omp parallel for shared(P1,P2) private(i,val1,val2) - for(i=0; i<DimTotal; i++) { - val1 = fabs(P1[i]); - val2 = fabs(P2[i]); - if (val1 < 1.0f) {val1 = 1.0f;} - if (val2 < 1.0f) {val2 = 1.0f;} - P1[i] = P1[i]/val1; - P2[i] = P2[i]/val2; - } - } - return 1; -} float Rupd_dfunc2D(float *P1, float *P1_old, float *P2, float *P2_old, float *R1, float *R2, float tkp1, float tk, long DimTotal) { long i; @@ -314,14 +283,14 @@ float GradNorm_func3D(float *B, float *B_x, float *B_y, float *B_z, float eta, l for(k=0; k<dimZ; k++) { for(j=0; j<dimY; j++) { for(i=0; i<dimX; i++) { - + index = (dimX*dimY)*k + j*dimX+i; - + /* zero boundary conditions */ if (i == dimX-1) {val1 = 0.0f;} else {val1 = B[(dimX*dimY)*k + j*dimX+(i+1)];} if (j == dimY-1) {val2 = 0.0f;} else {val2 = B[(dimX*dimY)*k + (j+1)*dimX+i];} if (k == dimZ-1) {val3 = 0.0f;} else {val3 = B[(dimX*dimY)*(k+1) + (j)*dimX+i];} - + gradX = val1 - B[index]; gradY = val2 - B[index]; gradZ = val3 - B[index]; @@ -382,52 +351,18 @@ float Grad_dfunc3D(float *P1, float *P2, float *P3, float *D, float *R1, float * if (i == dimX-1) val1 = 0.0f; else val1 = D[index] - D[(dimX*dimY)*k + j*dimX + (i+1)]; if (j == dimY-1) val2 = 0.0f; else val2 = D[index] - D[(dimX*dimY)*k + (j+1)*dimX + i]; if (k == dimZ-1) val3 = 0.0f; else val3 = D[index] - D[(dimX*dimY)*(k+1) + j*dimX + i]; - + in_prod = val1*B_x[index] + val2*B_y[index] + val3*B_z[index]; /* calculate inner product */ val1 = val1 - in_prod*B_x[index]; val2 = val2 - in_prod*B_y[index]; val3 = val3 - in_prod*B_z[index]; - + P1[index] = R1[index] + multip*val1; P2[index] = R2[index] + multip*val2; P3[index] = R3[index] + multip*val3; }}} return 1; } -float Proj_dfunc3D(float *P1, float *P2, float *P3, int methTV, long DimTotal) -{ - float val1, val2, val3, denom, sq_denom; - long i; - if (methTV == 0) { - /* isotropic TV*/ -#pragma omp parallel for shared(P1,P2,P3) private(i,val1,val2,val3,sq_denom) - for(i=0; i<DimTotal; i++) { - denom = powf(P1[i],2) + powf(P2[i],2) + powf(P3[i],2); - if (denom > 1.0f) { - sq_denom = 1.0f/sqrtf(denom); - P1[i] = P1[i]*sq_denom; - P2[i] = P2[i]*sq_denom; - P3[i] = P3[i]*sq_denom; - } - } - } - else { - /* anisotropic TV*/ -#pragma omp parallel for shared(P1,P2,P3) private(i,val1,val2,val3) - for(i=0; i<DimTotal; i++) { - val1 = fabs(P1[i]); - val2 = fabs(P2[i]); - val3 = fabs(P3[i]); - if (val1 < 1.0f) {val1 = 1.0f;} - if (val2 < 1.0f) {val2 = 1.0f;} - if (val3 < 1.0f) {val3 = 1.0f;} - P1[i] = P1[i]/val1; - P2[i] = P2[i]/val2; - P3[i] = P3[i]/val3; - } - } - return 1; -} float Rupd_dfunc3D(float *P1, float *P1_old, float *P2, float *P2_old, float *P3, float *P3_old, float *R1, float *R2, float *R3, float tkp1, float tk, long DimTotal) { long i; diff --git a/src/Core/regularisers_CPU/FGP_dTV_core.h b/src/Core/regularisers_CPU/FGP_dTV_core.h index 9ace06d..8582170 100644 --- a/src/Core/regularisers_CPU/FGP_dTV_core.h +++ b/src/Core/regularisers_CPU/FGP_dTV_core.h @@ -59,14 +59,12 @@ CCPI_EXPORT float GradNorm_func2D(float *B, float *B_x, float *B_y, float eta, l CCPI_EXPORT float ProjectVect_func2D(float *R1, float *R2, float *B_x, float *B_y, long dimX, long dimY); CCPI_EXPORT float Obj_dfunc2D(float *A, float *D, float *R1, float *R2, float lambda, long dimX, long dimY); CCPI_EXPORT float Grad_dfunc2D(float *P1, float *P2, float *D, float *R1, float *R2, float *B_x, float *B_y, float lambda, long dimX, long dimY); -CCPI_EXPORT float Proj_dfunc2D(float *P1, float *P2, int methTV, long DimTotal); CCPI_EXPORT float Rupd_dfunc2D(float *P1, float *P1_old, float *P2, float *P2_old, float *R1, float *R2, float tkp1, float tk, long DimTotal); CCPI_EXPORT float GradNorm_func3D(float *B, float *B_x, float *B_y, float *B_z, float eta, long dimX, long dimY, long dimZ); CCPI_EXPORT float ProjectVect_func3D(float *R1, float *R2, float *R3, float *B_x, float *B_y, float *B_z, long dimX, long dimY, long dimZ); CCPI_EXPORT float Obj_dfunc3D(float *A, float *D, float *R1, float *R2, float *R3, float lambda, long dimX, long dimY, long dimZ); CCPI_EXPORT float Grad_dfunc3D(float *P1, float *P2, float *P3, float *D, float *R1, float *R2, float *R3, float *B_x, float *B_y, float *B_z, float lambda, long dimX, long dimY, long dimZ); -CCPI_EXPORT float Proj_dfunc3D(float *P1, float *P2, float *P3, int methTV, long DimTotal); CCPI_EXPORT float Rupd_dfunc3D(float *P1, float *P1_old, float *P2, float *P2_old, float *P3, float *P3_old, float *R1, float *R2, float *R3, float tkp1, float tk, long DimTotal); #ifdef __cplusplus } diff --git a/src/Core/regularisers_CPU/PD_TV_core.c b/src/Core/regularisers_CPU/PD_TV_core.c new file mode 100644 index 0000000..534091b --- /dev/null +++ b/src/Core/regularisers_CPU/PD_TV_core.c @@ -0,0 +1,252 @@ +/* + * This work is part of the Core Imaging Library developed by + * Visual Analytics and Imaging System Group of the Science Technology + * Facilities Council, STFC + * + * Copyright 2019 Daniil Kazantsev + * Copyright 2019 Srikanth Nagella, Edoardo Pasca + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * http://www.apache.org/licenses/LICENSE-2.0 + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include "PD_TV_core.h" + +/* C-OMP implementation of Primal-Dual TV [1] by Chambolle Pock denoising/regularization model (2D/3D case) + * + * Input Parameters: + * 1. Noisy image/volume + * 2. lambdaPar - regularization parameter + * 3. Number of iterations + * 4. eplsilon: tolerance constant + * 5. lipschitz_const: convergence related parameter + * 6. TV-type: methodTV - 'iso' (0) or 'l1' (1) + * 7. nonneg: 'nonnegativity (0 is OFF by default, 1 is ON) + * 8. tau: time marching parameter + + * Output: + * [1] TV - Filtered/regularized image/volume + * [2] Information vector which contains [iteration no., reached tolerance] + * + * [1] Antonin Chambolle, Thomas Pock. "A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging", 2010 + */ + +float PDTV_CPU_main(float *Input, float *U, float *infovector, float lambdaPar, int iterationsNumb, float epsil, float lipschitz_const, int methodTV, int nonneg, float tau, int dimX, int dimY, int dimZ) +{ + int ll; + long j, DimTotal; + float re, re1, sigma, theta, lt; + re = 0.0f; re1 = 0.0f; + int count = 0; + + //tau = 1.0/powf(lipschitz_const,0.5); + //sigma = 1.0/powf(lipschitz_const,0.5); + sigma = 1.0/(lipschitz_const*tau); + theta = 1.0f; + lt = tau/lambdaPar; + ll = 0; + + + DimTotal = (long)(dimX*dimY*dimZ); + + copyIm(Input, U, (long)(dimX), (long)(dimY), (long)(dimZ)); + + if (dimZ <= 1) { + /*2D case */ + float *U_old=NULL, *P1=NULL, *P2=NULL; + + U_old = calloc(DimTotal, sizeof(float)); + P1 = calloc(DimTotal, sizeof(float)); + P2 = calloc(DimTotal, sizeof(float)); + + /* begin iterations */ + for(ll=0; ll<iterationsNumb; ll++) { + + /* computing the the dual P variable */ + DualP2D(U, P1, P2, (long)(dimX), (long)(dimY), sigma); + + /* apply nonnegativity */ + if (nonneg == 1) for(j=0; j<DimTotal; j++) {if (U[j] < 0.0f) U[j] = 0.0f;} + + /* projection step */ + Proj_func2D(P1, P2, methodTV, DimTotal); + + /* copy U to U_old */ + copyIm(U, U_old, (long)(dimX), (long)(dimY), 1l); + + /* calculate divergence */ + DivProj2D(U, Input, P1, P2,(long)(dimX), (long)(dimY), lt, tau); + + /* check early stopping criteria */ + if ((epsil != 0.0f) && (ll % 5 == 0)) { + re = 0.0f; re1 = 0.0f; + for(j=0; j<DimTotal; j++) + { + re += powf(U[j] - U_old[j],2); + re1 += powf(U[j],2); + } + re = sqrtf(re)/sqrtf(re1); + if (re < epsil) count++; + if (count > 3) break; + } + /*get updated solution*/ + + getX(U, U_old, theta, DimTotal); + } + free(P1); free(P2); free(U_old); + } + else { + /*3D case*/ + float *U_old=NULL, *P1=NULL, *P2=NULL, *P3=NULL; + U_old = calloc(DimTotal, sizeof(float)); + P1 = calloc(DimTotal, sizeof(float)); + P2 = calloc(DimTotal, sizeof(float)); + P3 = calloc(DimTotal, sizeof(float)); + + /* begin iterations */ + for(ll=0; ll<iterationsNumb; ll++) { + + /* computing the the dual P variable */ + DualP3D(U, P1, P2, P3, (long)(dimX), (long)(dimY), (long)(dimZ), sigma); + + /* apply nonnegativity */ + if (nonneg == 1) for(j=0; j<DimTotal; j++) {if (U[j] < 0.0f) U[j] = 0.0f;} + + /* projection step */ + Proj_func3D(P1, P2, P3, methodTV, DimTotal); + + /* copy U to U_old */ + copyIm(U, U_old, (long)(dimX), (long)(dimY), (long)(dimZ)); + + DivProj3D(U, Input, P1, P2, P3, (long)(dimX), (long)(dimY), (long)(dimZ), lt, tau); + + /* check early stopping criteria */ + if ((epsil != 0.0f) && (ll % 5 == 0)) { + re = 0.0f; re1 = 0.0f; + for(j=0; j<DimTotal; j++) + { + re += powf(U[j] - U_old[j],2); + re1 += powf(U[j],2); + } + re = sqrtf(re)/sqrtf(re1); + if (re < epsil) count++; + if (count > 3) break; + } + /*get updated solution*/ + + getX(U, U_old, theta, DimTotal); + } + free(P1); free(P2); free(P3); free(U_old); + } + /*adding info into info_vector */ + infovector[0] = (float)(ll); /*iterations number (if stopped earlier based on tolerance)*/ + infovector[1] = re; /* reached tolerance */ + + return 0; +} + +/*****************************************************************/ +/************************2D-case related Functions */ +/*****************************************************************/ + +/*Calculating dual variable (using forward differences)*/ +float DualP2D(float *U, float *P1, float *P2, long dimX, long dimY, float sigma) +{ + long i,j,index; + #pragma omp parallel for shared(U,P1,P2) private(index,i,j) + for(j=0; j<dimY; j++) { + for(i=0; i<dimX; i++) { + index = j*dimX+i; + /* symmetric boundary conditions (Neuman) */ + if (i == dimX-1) P1[index] += sigma*(U[j*dimX+(i-1)] - U[index]); + else P1[index] += sigma*(U[j*dimX+(i+1)] - U[index]); + if (j == dimY-1) P2[index] += sigma*(U[(j-1)*dimX+i] - U[index]); + else P2[index] += sigma*(U[(j+1)*dimX+i] - U[index]); + }} + return 1; +} + +/* Divergence for P dual */ +float DivProj2D(float *U, float *Input, float *P1, float *P2, long dimX, long dimY, float lt, float tau) +{ + long i,j,index; + float P_v1, P_v2, div_var; + #pragma omp parallel for shared(U,Input,P1,P2) private(index, i, j, P_v1, P_v2, div_var) + for(j=0; j<dimY; j++) { + for(i=0; i<dimX; i++) { + index = j*dimX+i; + /* symmetric boundary conditions (Neuman) */ + if (i == 0) P_v1 = -P1[index]; + else P_v1 = -(P1[index] - P1[j*dimX+(i-1)]); + if (j == 0) P_v2 = -P2[index]; + else P_v2 = -(P2[index] - P2[(j-1)*dimX+i]); + div_var = P_v1 + P_v2; + U[index] = (U[index] - tau*div_var + lt*Input[index])/(1.0 + lt); + }} + return *U; +} + +/*get the updated solution*/ +float getX(float *U, float *U_old, float theta, long DimTotal) +{ + long i; + #pragma omp parallel for shared(U,U_old) private(i) + for(i=0; i<DimTotal; i++) { + U[i] += theta*(U[i] - U_old[i]); + } + return *U; +} + + +/*****************************************************************/ +/************************3D-case related Functions */ +/*****************************************************************/ +/*Calculating dual variable (using forward differences)*/ +float DualP3D(float *U, float *P1, float *P2, float *P3, long dimX, long dimY, long dimZ, float sigma) +{ + long i,j,k,index; + #pragma omp parallel for shared(U,P1,P2,P3) private(index,i,j,k) + for(k=0; k<dimZ; k++) { + for(j=0; j<dimY; j++) { + for(i=0; i<dimX; i++) { + index = (dimX*dimY)*k + j*dimX+i; + /* symmetric boundary conditions (Neuman) */ + if (i == dimX-1) P1[index] += sigma*(U[(dimX*dimY)*k + j*dimX+(i-1)] - U[index]); + else P1[index] += sigma*(U[(dimX*dimY)*k + j*dimX+(i+1)] - U[index]); + if (j == dimY-1) P2[index] += sigma*(U[(dimX*dimY)*k + (j-1)*dimX+i] - U[index]); + else P2[index] += sigma*(U[(dimX*dimY)*k + (j+1)*dimX+i] - U[index]); + if (k == dimZ-1) P3[index] += sigma*(U[(dimX*dimY)*(k-1) + j*dimX+i] - U[index]); + else P3[index] += sigma*(U[(dimX*dimY)*(k+1) + j*dimX+i] - U[index]); + }}} + return 1; +} + +/* Divergence for P dual */ +float DivProj3D(float *U, float *Input, float *P1, float *P2, float *P3, long dimX, long dimY, long dimZ, float lt, float tau) +{ + long i,j,k,index; + float P_v1, P_v2, P_v3, div_var; + #pragma omp parallel for shared(U,Input,P1,P2) private(index, i, j, k, P_v1, P_v2, P_v3, div_var) + for(k=0; k<dimZ; k++) { + for(j=0; j<dimY; j++) { + for(i=0; i<dimX; i++) { + index = (dimX*dimY)*k + j*dimX+i; + /* symmetric boundary conditions (Neuman) */ + if (i == 0) P_v1 = -P1[index]; + else P_v1 = -(P1[index] - P1[(dimX*dimY)*k + j*dimX+(i-1)]); + if (j == 0) P_v2 = -P2[index]; + else P_v2 = -(P2[index] - P2[(dimX*dimY)*k + (j-1)*dimX+i]); + if (k == 0) P_v3 = -P3[index]; + else P_v3 = -(P3[index] - P3[(dimX*dimY)*(k-1) + j*dimX+i]); + div_var = P_v1 + P_v2 + P_v3; + U[index] = (U[index] - tau*div_var + lt*Input[index])/(1.0 + lt); + }}} + return *U; +} diff --git a/src/Core/regularisers_CPU/PD_TV_core.h b/src/Core/regularisers_CPU/PD_TV_core.h new file mode 100644 index 0000000..97edc05 --- /dev/null +++ b/src/Core/regularisers_CPU/PD_TV_core.h @@ -0,0 +1,60 @@ +/* +This work is part of the Core Imaging Library developed by +Visual Analytics and Imaging System Group of the Science Technology +Facilities Council, STFC + +Copyright 2019 Daniil Kazantsev +Copyright 2019 Srikanth Nagella, Edoardo Pasca + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +//#include <matrix.h> +#include <math.h> +#include <stdlib.h> +#include <memory.h> +#include <stdio.h> +#include "omp.h" +#include "utils.h" +#include "CCPiDefines.h" + +/* C-OMP implementation of Primal-Dual TV [1] by Chambolle Pock denoising/regularization model (2D/3D case) + * + * Input Parameters: + * 1. Noisy image/volume + * 2. lambdaPar - regularization parameter + * 3. Number of iterations + * 4. eplsilon: tolerance constant + * 5. lipschitz_const: convergence related parameter + * 6. TV-type: methodTV - 'iso' (0) or 'l1' (1) + * 7. nonneg: 'nonnegativity (0 is OFF by default, 1 is ON) + + * Output: + * [1] TV - Filtered/regularized image/volume + * [2] Information vector which contains [iteration no., reached tolerance] + * + * [1] Antonin Chambolle, Thomas Pock. "A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging", 2010 + */ + +#ifdef __cplusplus +extern "C" { +#endif +float PDTV_CPU_main(float *Input, float *U, float *infovector, float lambdaPar, int iterationsNumb, float epsil, float lipschitz_const, int methodTV, int nonneg, float tau, int dimX, int dimY, int dimZ); + +CCPI_EXPORT float DualP2D(float *U, float *P1, float *P2, long dimX, long dimY, float sigma); +CCPI_EXPORT float DivProj2D(float *U, float *Input, float *P1, float *P2, long dimX, long dimY, float lt, float tau); +CCPI_EXPORT float getX(float *U, float *U_old, float theta, long DimTotal); + +CCPI_EXPORT float DualP3D(float *U, float *P1, float *P2, float *P3, long dimX, long dimY, long dimZ, float sigma); +CCPI_EXPORT float DivProj3D(float *U, float *Input, float *P1, float *P2, float *P3, long dimX, long dimY, long dimZ, float lt, float tau); +#ifdef __cplusplus +} +#endif diff --git a/src/Core/regularisers_CPU/utils.c b/src/Core/regularisers_CPU/utils.c index 5bb3a5c..088ace9 100644 --- a/src/Core/regularisers_CPU/utils.c +++ b/src/Core/regularisers_CPU/utils.c @@ -65,7 +65,7 @@ float TV_energy2D(float *U, float *U0, float *E_val, float lambda, int type, int { int i, j, i1, j1, index; float NOMx_2, NOMy_2, E_Grad=0.0f, E_Data=0.0f; - + /* first calculate \grad U_xy*/ for(j=0; j<dimY; j++) { for(i=0; i<dimX; i++) { @@ -73,7 +73,7 @@ float TV_energy2D(float *U, float *U0, float *E_val, float lambda, int type, int /* boundary conditions */ i1 = i + 1; if (i == dimX-1) i1 = i; j1 = j + 1; if (j == dimY-1) j1 = j; - + /* Forward differences */ NOMx_2 = powf((float)(U[j1*dimX + i] - U[index]),2); /* x+ */ NOMy_2 = powf((float)(U[j*dimX + i1] - U[index]),2); /* y+ */ @@ -90,7 +90,7 @@ float TV_energy3D(float *U, float *U0, float *E_val, float lambda, int type, int { long i, j, k, i1, j1, k1, index; float NOMx_2, NOMy_2, NOMz_2, E_Grad=0.0f, E_Data=0.0f; - + /* first calculate \grad U_xy*/ for(j=0; j<(long)(dimY); j++) { for(i=0; i<(long)(dimX); i++) { @@ -100,12 +100,12 @@ float TV_energy3D(float *U, float *U0, float *E_val, float lambda, int type, int i1 = i + 1; if (i == (long)(dimX-1)) i1 = i; j1 = j + 1; if (j == (long)(dimY-1)) j1 = j; k1 = k + 1; if (k == (long)(dimZ-1)) k1 = k; - + /* Forward differences */ NOMx_2 = powf((float)(U[(dimX*dimY)*k + j1*dimX+i] - U[index]),2); /* x+ */ NOMy_2 = powf((float)(U[(dimX*dimY)*k + j*dimX+i1] - U[index]),2); /* y+ */ NOMz_2 = powf((float)(U[(dimX*dimY)*k1 + j*dimX+i] - U[index]),2); /* z+ */ - + E_Grad += 2.0f*lambda*sqrtf((float)(NOMx_2) + (float)(NOMy_2) + (float)(NOMz_2)); /* gradient term energy */ E_Data += (powf((float)(U[index]-U0[index]),2)); /* fidelity term energy */ } @@ -131,12 +131,12 @@ float Im_scale2D(float *Input, float *Scaled, int w, int h, int w2, int h2) x_diff = (x_ratio * j) - x; y_diff = (y_ratio * i) - y; index = y*w+x ; - + A = Input[index]; B = Input[index+1]; C = Input[index+w]; D = Input[index+w+1]; - + gray = (float)(A*(1.0 - x_diff)*(1.0 - y_diff) + B*(x_diff)*(1.0 - y_diff) + C*(y_diff)*(1.0 - x_diff) + D*(x_diff*y_diff)); @@ -144,3 +144,70 @@ float Im_scale2D(float *Input, float *Scaled, int w, int h, int w2, int h2) }} return *Scaled; } + +/*2D Projection onto convex set for P (called in PD_TV, FGP_dTV and FGP_TV methods)*/ +float Proj_func2D(float *P1, float *P2, int methTV, long DimTotal) +{ + float val1, val2, denom, sq_denom; + long i; + if (methTV == 0) { + /* isotropic TV*/ +#pragma omp parallel for shared(P1,P2) private(i,denom,sq_denom) + for(i=0; i<DimTotal; i++) { + denom = powf(P1[i],2) + powf(P2[i],2); + if (denom > 1.0f) { + sq_denom = 1.0f/sqrtf(denom); + P1[i] = P1[i]*sq_denom; + P2[i] = P2[i]*sq_denom; + } + } + } + else { + /* anisotropic TV*/ +#pragma omp parallel for shared(P1,P2) private(i,val1,val2) + for(i=0; i<DimTotal; i++) { + val1 = fabs(P1[i]); + val2 = fabs(P2[i]); + if (val1 < 1.0f) {val1 = 1.0f;} + if (val2 < 1.0f) {val2 = 1.0f;} + P1[i] = P1[i]/val1; + P2[i] = P2[i]/val2; + } + } + return 1; +} +/*3D Projection onto convex set for P (called in PD_TV, FGP_TV, FGP_dTV methods)*/ +float Proj_func3D(float *P1, float *P2, float *P3, int methTV, long DimTotal) +{ + float val1, val2, val3, denom, sq_denom; + long i; + if (methTV == 0) { + /* isotropic TV*/ +#pragma omp parallel for shared(P1,P2,P3) private(i,val1,val2,val3,sq_denom) + for(i=0; i<DimTotal; i++) { + denom = powf(P1[i],2) + powf(P2[i],2) + powf(P3[i],2); + if (denom > 1.0f) { + sq_denom = 1.0f/sqrtf(denom); + P1[i] = P1[i]*sq_denom; + P2[i] = P2[i]*sq_denom; + P3[i] = P3[i]*sq_denom; + } + } + } + else { + /* anisotropic TV*/ +#pragma omp parallel for shared(P1,P2,P3) private(i,val1,val2,val3) + for(i=0; i<DimTotal; i++) { + val1 = fabs(P1[i]); + val2 = fabs(P2[i]); + val3 = fabs(P3[i]); + if (val1 < 1.0f) {val1 = 1.0f;} + if (val2 < 1.0f) {val2 = 1.0f;} + if (val3 < 1.0f) {val3 = 1.0f;} + P1[i] = P1[i]/val1; + P2[i] = P2[i]/val2; + P3[i] = P3[i]/val3; + } + } + return 1; +} diff --git a/src/Core/regularisers_CPU/utils.h b/src/Core/regularisers_CPU/utils.h index 8f0ba59..4b57f71 100644 --- a/src/Core/regularisers_CPU/utils.h +++ b/src/Core/regularisers_CPU/utils.h @@ -31,6 +31,8 @@ CCPI_EXPORT float TV_energy2D(float *U, float *U0, float *E_val, float lambda, i CCPI_EXPORT float TV_energy3D(float *U, float *U0, float *E_val, float lambda, int type, int dimX, int dimY, int dimZ); CCPI_EXPORT float TV_energy3D(float *U, float *U0, float *E_val, float lambda, int type, int dimX, int dimY, int dimZ); CCPI_EXPORT float Im_scale2D(float *Input, float *Scaled, int w, int h, int w2, int h2); +CCPI_EXPORT float Proj_func2D(float *P1, float *P2, int methTV, long DimTotal); +CCPI_EXPORT float Proj_func3D(float *P1, float *P2, float *P3, int methTV, long DimTotal); #ifdef __cplusplus } #endif diff --git a/src/Core/regularisers_GPU/TV_PD_GPU_core.cu b/src/Core/regularisers_GPU/TV_PD_GPU_core.cu new file mode 100644 index 0000000..59fda3b --- /dev/null +++ b/src/Core/regularisers_GPU/TV_PD_GPU_core.cu @@ -0,0 +1,522 @@ +/* +This work is part of the Core Imaging Library developed by +Visual Analytics and Imaging System Group of the Science Technology +Facilities Council, STFC + +Copyright 2017 Daniil Kazantsev +Copyright 2017 Srikanth Nagella, Edoardo Pasca + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +#include "TV_PD_GPU_core.h" +#include "shared.h" +#include <thrust/functional.h> +#include <thrust/device_vector.h> +#include <thrust/transform_reduce.h> + +/* CUDA implementation of Primal-Dual TV [1] by Chambolle Pock denoising/regularization model (2D/3D case) + * + * Input Parameters: + * 1. Noisy image/volume + * 2. lambdaPar - regularization parameter + * 3. Number of iterations + * 4. eplsilon: tolerance constant + * 5. lipschitz_const: convergence related parameter + * 6. TV-type: methodTV - 'iso' (0) or 'l1' (1) + * 7. nonneg: 'nonnegativity (0 is OFF by default, 1 is ON) + * 8. tau: time marching parameter + + * Output: + * [1] TV - Filtered/regularized image/volume + * [2] Information vector which contains [iteration no., reached tolerance] + * + * [1] Antonin Chambolle, Thomas Pock. "A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging", 2010 + */ + +#define BLKXSIZE2D 16 +#define BLKYSIZE2D 16 + +#define BLKXSIZE 8 +#define BLKYSIZE 8 +#define BLKZSIZE 8 + +#define idivup(a, b) ( ((a)%(b) != 0) ? (a)/(b)+1 : (a)/(b) ) +// struct square { __host__ __device__ float operator()(float x) { return x * x; } }; + +/************************************************/ +/*****************2D modules*********************/ +/************************************************/ + +__global__ void dualPD_kernel(float *U, float *P1, float *P2, float sigma, int N, int M) +{ + + //calculate each thread global index + const int xIndex=blockIdx.x*blockDim.x+threadIdx.x; + const int yIndex=blockIdx.y*blockDim.y+threadIdx.y; + + int index = xIndex + N*yIndex; + + if ((xIndex < N) && (yIndex < M)) { + if (xIndex == N-1) P1[index] += sigma*(U[(xIndex-1) + N*yIndex] - U[index]); + else P1[index] += sigma*(U[(xIndex+1) + N*yIndex] - U[index]); + if (yIndex == M-1) P2[index] += sigma*(U[xIndex + N*(yIndex-1)] - U[index]); + else P2[index] += sigma*(U[xIndex + N*(yIndex+1)] - U[index]); + } + return; +} +__global__ void Proj_funcPD2D_iso_kernel(float *P1, float *P2, int N, int M, int ImSize) +{ + + float denom; + //calculate each thread global index + const int xIndex=blockIdx.x*blockDim.x+threadIdx.x; + const int yIndex=blockIdx.y*blockDim.y+threadIdx.y; + + int index = xIndex + N*yIndex; + + if ((xIndex < N) && (yIndex < M)) { + denom = pow(P1[index],2) + pow(P2[index],2); + if (denom > 1.0f) { + P1[index] = P1[index]/sqrt(denom); + P2[index] = P2[index]/sqrt(denom); + } + } + return; +} +__global__ void Proj_funcPD2D_aniso_kernel(float *P1, float *P2, int N, int M, int ImSize) +{ + + float val1, val2; + //calculate each thread global index + const int xIndex=blockIdx.x*blockDim.x+threadIdx.x; + const int yIndex=blockIdx.y*blockDim.y+threadIdx.y; + + int index = xIndex + N*yIndex; + + if ((xIndex < N) && (yIndex < M)) { + val1 = abs(P1[index]); + val2 = abs(P2[index]); + if (val1 < 1.0f) {val1 = 1.0f;} + if (val2 < 1.0f) {val2 = 1.0f;} + P1[index] = P1[index]/val1; + P2[index] = P2[index]/val2; + } + return; +} +__global__ void DivProj2D_kernel(float *U, float *Input, float *P1, float *P2, float lt, float tau, int N, int M) +{ + float P_v1, P_v2, div_var; + + //calculate each thread global index + const int xIndex=blockIdx.x*blockDim.x+threadIdx.x; + const int yIndex=blockIdx.y*blockDim.y+threadIdx.y; + + int index = xIndex + N*yIndex; + + if ((xIndex < N) && (yIndex < M)) { + if (xIndex == 0) P_v1 = -P1[index]; + else P_v1 = -(P1[index] - P1[(xIndex-1) + N*yIndex]); + if (yIndex == 0) P_v2 = -P2[index]; + else P_v2 = -(P2[index] - P2[xIndex + N*(yIndex-1)]); + div_var = P_v1 + P_v2; + U[index] = (U[index] - tau*div_var + lt*Input[index])/(1.0 + lt); + } + return; +} +__global__ void PDnonneg2D_kernel(float* Output, int N, int M, int num_total) +{ + int xIndex = blockDim.x * blockIdx.x + threadIdx.x; + int yIndex = blockDim.y * blockIdx.y + threadIdx.y; + + int index = xIndex + N*yIndex; + + if (index < num_total) { + if (Output[index] < 0.0f) Output[index] = 0.0f; + } +} +/************************************************/ +/*****************3D modules*********************/ +/************************************************/ +__global__ void dualPD3D_kernel(float *U, float *P1, float *P2, float *P3, float sigma, int N, int M, int Z) +{ + + //calculate each thread global index + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if ((i < N) && (j < M) && (k < Z)) { + if (i == N-1) P1[index] += sigma*(U[(N*M)*k + (i-1) + N*j] - U[index]); + else P1[index] += sigma*(U[(N*M)*k + (i+1) + N*j] - U[index]); + if (j == M-1) P2[index] += sigma*(U[(N*M)*k + i + N*(j-1)] - U[index]); + else P2[index] += sigma*(U[(N*M)*k + i + N*(j+1)] - U[index]); + if (k == Z-1) P3[index] += sigma*(U[(N*M)*(k-1) + i + N*j] - U[index]); + else P3[index] += sigma*(U[(N*M)*(k+1) + i + N*j] - U[index]); + } + return; +} +__global__ void Proj_funcPD3D_iso_kernel(float *P1, float *P2, float *P3, int N, int M, int Z, int ImSize) +{ + + float denom,sq_denom; + //calculate each thread global index + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if ((i < N) && (j < M) && (k < Z)) { + denom = pow(P1[index],2) + pow(P2[index],2) + pow(P3[index],2); + if (denom > 1.0f) { + sq_denom = 1.0f/sqrt(denom); + P1[index] *= sq_denom; + P2[index] *= sq_denom; + P3[index] *= sq_denom; + } + } + return; +} +__global__ void Proj_funcPD3D_aniso_kernel(float *P1, float *P2, float *P3, int N, int M, int Z, int ImSize) +{ + + float val1, val2, val3; + //calculate each thread global index + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if ((i < N) && (j < M) && (k < Z)) { + val1 = abs(P1[index]); + val2 = abs(P2[index]); + val3 = abs(P3[index]); + if (val1 < 1.0f) {val1 = 1.0f;} + if (val2 < 1.0f) {val2 = 1.0f;} + if (val3 < 1.0f) {val3 = 1.0f;} + P1[index] /= val1; + P2[index] /= val2; + P3[index] /= val3; + } + return; +} +__global__ void DivProj3D_kernel(float *U, float *Input, float *P1, float *P2, float *P3, float lt, float tau, int N, int M, int Z) +{ + float P_v1, P_v2, P_v3, div_var; + + //calculate each thread global index + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if ((i < N) && (j < M) && (k < Z)) { + if (i == 0) P_v1 = -P1[index]; + else P_v1 = -(P1[index] - P1[(N*M)*k + (i-1) + N*j]); + if (j == 0) P_v2 = -P2[index]; + else P_v2 = -(P2[index] - P2[(N*M)*k + i + N*(j-1)]); + if (k == 0) P_v3 = -P3[index]; + else P_v3 = -(P3[index] - P3[(N*M)*(k-1) + i + N*j]); + div_var = P_v1 + P_v2 + P_v3; + U[index] = (U[index] - tau*div_var + lt*Input[index])/(1.0 + lt); + } + return; +} + +__global__ void PDnonneg3D_kernel(float* Output, int N, int M, int Z, int num_total) +{ + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if (index < num_total) { + if (Output[index] < 0.0f) Output[index] = 0.0f; + } +} +__global__ void PDcopy_kernel2D(float *Input, float* Output, int N, int M, int num_total) +{ + int xIndex = blockDim.x * blockIdx.x + threadIdx.x; + int yIndex = blockDim.y * blockIdx.y + threadIdx.y; + + int index = xIndex + N*yIndex; + + if (index < num_total) { + Output[index] = Input[index]; + } +} + +__global__ void PDcopy_kernel3D(float *Input, float* Output, int N, int M, int Z, int num_total) +{ + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if (index < num_total) { + Output[index] = Input[index]; + } +} + +__global__ void getU2D_kernel(float *Input, float *Input_old, float theta, int N, int M, int num_total) +{ + int xIndex = blockDim.x * blockIdx.x + threadIdx.x; + int yIndex = blockDim.y * blockIdx.y + threadIdx.y; + + int index = xIndex + N*yIndex; + + if (index < num_total) { + Input[index] += theta*(Input[index] - Input_old[index]); + } +} + +__global__ void getU3D_kernel(float *Input, float *Input_old, float theta, int N, int M, int Z, int num_total) +{ + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if (index < num_total) { + Input[index] += theta*(Input[index] - Input_old[index]); + } +} + +__global__ void PDResidCalc2D_kernel(float *Input1, float *Input2, float* Output, int N, int M, int num_total) +{ + int xIndex = blockDim.x * blockIdx.x + threadIdx.x; + int yIndex = blockDim.y * blockIdx.y + threadIdx.y; + + int index = xIndex + N*yIndex; + + if (index < num_total) { + Output[index] = Input1[index] - Input2[index]; + } +} + +__global__ void PDResidCalc3D_kernel(float *Input1, float *Input2, float* Output, int N, int M, int Z, int num_total) +{ + int i = blockDim.x * blockIdx.x + threadIdx.x; + int j = blockDim.y * blockIdx.y + threadIdx.y; + int k = blockDim.z * blockIdx.z + threadIdx.z; + + int index = (N*M)*k + i + N*j; + + if (index < num_total) { + Output[index] = Input1[index] - Input2[index]; + } +} + + +/*%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%*/ + +////////////MAIN HOST FUNCTION /////////////// +extern "C" int TV_PD_GPU_main(float *Input, float *Output, float *infovector, float lambdaPar, int iter, float epsil, float lipschitz_const, int methodTV, int nonneg, float tau, int dimX, int dimY, int dimZ) +{ + int deviceCount = -1; // number of devices + cudaGetDeviceCount(&deviceCount); + if (deviceCount == 0) { + fprintf(stderr, "No CUDA devices found\n"); + return -1; + } + int count = 0, i; + float re, sigma, theta, lt; + re = 0.0f; + + sigma = 1.0/(lipschitz_const*tau); + theta = 1.0f; + lt = tau/lambdaPar; + + if (dimZ <= 1) { + /*2D verson*/ + int ImSize = dimX*dimY; + float *d_input, *d_update, *d_old=NULL, *P1=NULL, *P2=NULL; + + dim3 dimBlock(BLKXSIZE2D,BLKYSIZE2D); + dim3 dimGrid(idivup(dimX,BLKXSIZE2D), idivup(dimY,BLKYSIZE2D)); + + /*allocate space for images on device*/ + checkCudaErrors( cudaMalloc((void**)&d_input,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&d_update,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&d_old,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&P1,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&P2,ImSize*sizeof(float)) ); + + checkCudaErrors( cudaMemcpy(d_input,Input,ImSize*sizeof(float),cudaMemcpyHostToDevice)); + checkCudaErrors( cudaMemcpy(d_update,Input,ImSize*sizeof(float),cudaMemcpyHostToDevice)); + cudaMemset(P1, 0, ImSize*sizeof(float)); + cudaMemset(P2, 0, ImSize*sizeof(float)); + + /********************** Run CUDA 2D kernel here ********************/ + /* The main kernel */ + for (i = 0; i < iter; i++) { + + /* computing the the dual P variable */ + dualPD_kernel<<<dimGrid,dimBlock>>>(d_update, P1, P2, sigma, dimX, dimY); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + if (nonneg != 0) { + PDnonneg2D_kernel<<<dimGrid,dimBlock>>>(d_update, dimX, dimY, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); } + + /* projection step */ + if (methodTV == 0) Proj_funcPD2D_iso_kernel<<<dimGrid,dimBlock>>>(P1, P2, dimX, dimY, ImSize); /*isotropic TV*/ + else Proj_funcPD2D_aniso_kernel<<<dimGrid,dimBlock>>>(P1, P2, dimX, dimY, ImSize); /*anisotropic TV*/ + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + /* copy U to U_old */ + PDcopy_kernel2D<<<dimGrid,dimBlock>>>(d_update, d_old, dimX, dimY, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + /* calculate divergence */ + DivProj2D_kernel<<<dimGrid,dimBlock>>>(d_update, d_input, P1, P2, lt, tau, dimX, dimY); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + if ((epsil != 0.0f) && (i % 5 == 0)) { + /* calculate norm - stopping rules using the Thrust library */ + PDResidCalc2D_kernel<<<dimGrid,dimBlock>>>(d_update, d_old, P1, dimX, dimY, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + // setup arguments + square<float> unary_op; + thrust::plus<float> binary_op; + thrust::device_vector<float> d_vec(P1, P1 + ImSize); + float reduction = std::sqrt(thrust::transform_reduce(d_vec.begin(), d_vec.end(), unary_op, 0.0f, binary_op)); + thrust::device_vector<float> d_vec2(d_update, d_update + ImSize); + float reduction2 = std::sqrt(thrust::transform_reduce(d_vec2.begin(), d_vec2.end(), unary_op, 0.0f, binary_op)); + + // compute norm + re = (reduction/reduction2); + if (re < epsil) count++; + if (count > 3) break; + } + + getU2D_kernel<<<dimGrid,dimBlock>>>(d_update, d_old, theta, dimX, dimY, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + } + //copy result matrix from device to host memory + cudaMemcpy(Output,d_update,ImSize*sizeof(float),cudaMemcpyDeviceToHost); + + cudaFree(d_input); + cudaFree(d_update); + cudaFree(d_old); + cudaFree(P1); + cudaFree(P2); + + } + else { + /*3D verson*/ + int ImSize = dimX*dimY*dimZ; + float *d_input, *d_update, *d_old=NULL, *P1=NULL, *P2=NULL, *P3=NULL; + + dim3 dimBlock(BLKXSIZE,BLKYSIZE,BLKZSIZE); + dim3 dimGrid(idivup(dimX,BLKXSIZE), idivup(dimY,BLKYSIZE),idivup(dimZ,BLKZSIZE)); + + /*allocate space for images on device*/ + checkCudaErrors( cudaMalloc((void**)&d_input,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&d_update,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&d_old,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&P1,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&P2,ImSize*sizeof(float)) ); + checkCudaErrors( cudaMalloc((void**)&P3,ImSize*sizeof(float)) ); + + checkCudaErrors( cudaMemcpy(d_input,Input,ImSize*sizeof(float),cudaMemcpyHostToDevice)); + checkCudaErrors( cudaMemcpy(d_update,Input,ImSize*sizeof(float),cudaMemcpyHostToDevice)); + cudaMemset(P1, 0, ImSize*sizeof(float)); + cudaMemset(P2, 0, ImSize*sizeof(float)); + cudaMemset(P3, 0, ImSize*sizeof(float)); + /********************** Run CUDA 3D kernel here ********************/ + for (i = 0; i < iter; i++) { + + /* computing the the dual P variable */ + dualPD3D_kernel<<<dimGrid,dimBlock>>>(d_update, P1, P2, P3, sigma, dimX, dimY, dimZ); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + if (nonneg != 0) { + PDnonneg3D_kernel<<<dimGrid,dimBlock>>>(d_update, dimX, dimY, dimZ, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); } + + /* projection step */ + if (methodTV == 0) Proj_funcPD3D_iso_kernel<<<dimGrid,dimBlock>>>(P1, P2, P3, dimX, dimY, dimZ, ImSize); /*isotropic TV*/ + else Proj_funcPD3D_aniso_kernel<<<dimGrid,dimBlock>>>(P1, P2, P3, dimX, dimY, dimZ, ImSize); /*anisotropic TV*/ + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + /* copy U to U_old */ + PDcopy_kernel3D<<<dimGrid,dimBlock>>>(d_update, d_old, dimX, dimY, dimZ, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + /* calculate divergence */ + DivProj3D_kernel<<<dimGrid,dimBlock>>>(d_update, d_input, P1, P2, P3, lt, tau, dimX, dimY, dimZ); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + if ((epsil != 0.0f) && (i % 5 == 0)) { + /* calculate norm - stopping rules using the Thrust library */ + PDResidCalc3D_kernel<<<dimGrid,dimBlock>>>(d_update, d_old, P1, dimX, dimY, dimZ, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + + // setup arguments + square<float> unary_op; + thrust::plus<float> binary_op; + thrust::device_vector<float> d_vec(P1, P1 + ImSize); + float reduction = std::sqrt(thrust::transform_reduce(d_vec.begin(), d_vec.end(), unary_op, 0.0f, binary_op)); + thrust::device_vector<float> d_vec2(d_update, d_update + ImSize); + float reduction2 = std::sqrt(thrust::transform_reduce(d_vec2.begin(), d_vec2.end(), unary_op, 0.0f, binary_op)); + + // compute norm + re = (reduction/reduction2); + if (re < epsil) count++; + if (count > 3) break; + } + + /* get U*/ + getU3D_kernel<<<dimGrid,dimBlock>>>(d_update, d_old, theta, dimX, dimY, dimZ, ImSize); + checkCudaErrors( cudaDeviceSynchronize() ); + checkCudaErrors(cudaPeekAtLastError() ); + } + /***************************************************************/ + //copy result matrix from device to host memory + cudaMemcpy(Output,d_update,ImSize*sizeof(float),cudaMemcpyDeviceToHost); + + cudaFree(d_input); + cudaFree(d_update); + cudaFree(d_old); + cudaFree(P1); + cudaFree(P2); + cudaFree(P3); + + } + //cudaDeviceReset(); + /*adding info into info_vector */ + infovector[0] = (float)(i); /*iterations number (if stopped earlier based on tolerance)*/ + infovector[1] = re; /* reached tolerance */ + return 0; +} diff --git a/src/Core/regularisers_GPU/TV_PD_GPU_core.h b/src/Core/regularisers_GPU/TV_PD_GPU_core.h new file mode 100644 index 0000000..2b123d9 --- /dev/null +++ b/src/Core/regularisers_GPU/TV_PD_GPU_core.h @@ -0,0 +1,9 @@ +#ifndef _TV_PD_GPU_ +#define _TV_PD_GPU_ + +#include "CCPiDefines.h" +#include <memory.h> + +extern "C" CCPI_EXPORT int TV_PD_GPU_main(float *Input, float *Output, float *infovector, float lambdaPar, int iter, float epsil, float lipschitz_const, int methodTV, int nonneg, float tau, int dimX, int dimY, int dimZ); + +#endif diff --git a/src/Matlab/mex_compile/compileCPU_mex_Linux.m b/src/Matlab/mex_compile/compileCPU_mex_Linux.m index 2ed7ea6..5330d7f 100644 --- a/src/Matlab/mex_compile/compileCPU_mex_Linux.m +++ b/src/Matlab/mex_compile/compileCPU_mex_Linux.m @@ -28,6 +28,10 @@ movefile('FGP_TV.mex*',Pathmove); fprintf('%s \n', 'Compiling SB-TV...'); mex SB_TV.c SB_TV_core.c utils.c CFLAGS="\$CFLAGS -fopenmp -Wall -std=c99" LDFLAGS="\$LDFLAGS -fopenmp" movefile('SB_TV.mex*',Pathmove); + +fprintf('%s \n', 'Compiling PD-TV...'); +mex PD_TV.c PD_TV_core.c utils.c CFLAGS="\$CFLAGS -fopenmp -Wall -std=c99" LDFLAGS="\$LDFLAGS -fopenmp" +movefile('PD_TV.mex*',Pathmove); fprintf('%s \n', 'Compiling dFGP-TV...'); mex FGP_dTV.c FGP_dTV_core.c utils.c CFLAGS="\$CFLAGS -fopenmp -Wall -std=c99" LDFLAGS="\$LDFLAGS -fopenmp" @@ -75,7 +79,8 @@ movefile('NonlocalMarching_Inpaint.mex*',Pathmove); delete SB_TV_core* ROF_TV_core* FGP_TV_core* FGP_dTV_core* TNV_core* utils* Diffusion_core* Diffus4th_order_core* TGV_core* LLT_ROF_core* CCPiDefines.h delete PatchSelect_core* Nonlocal_TV_core* delete Diffusion_Inpaint_core* NonlocalMarching_Inpaint_core* -fprintf('%s \n', '<<<<<<< Regularisers successfully compiled! >>>>>>>'); +delete PD_TV_core* +fprintf('%s \n', '<<<<<<< CPU regularisers were successfully compiled! >>>>>>>'); pathA2 = sprintf(['..' fsep '..' fsep '..' fsep '..' fsep 'demos' fsep 'Matlab_demos'], 1i); cd(pathA2);
\ No newline at end of file diff --git a/src/Matlab/mex_compile/compileGPU_mex.m b/src/Matlab/mex_compile/compileGPU_mex.m index 56fcd38..577630f 100644 --- a/src/Matlab/mex_compile/compileGPU_mex.m +++ b/src/Matlab/mex_compile/compileGPU_mex.m @@ -41,6 +41,11 @@ fprintf('%s \n', 'Compiling SB-TV...'); mex -g -I/usr/local/cuda-10.0/include -L/usr/local/cuda-10.0/lib64 -lcudart -lcufft -lmwgpu SB_TV_GPU.cpp TV_SB_GPU_core.o movefile('SB_TV_GPU.mex*',Pathmove); +fprintf('%s \n', 'Compiling PD-TV...'); +!/usr/local/cuda/bin/nvcc -O0 -c TV_PD_GPU_core.cu -Xcompiler -fPIC -I~/SOFT/MATLAB9/extern/include/ +mex -g -I/usr/local/cuda-10.0/include -L/usr/local/cuda-10.0/lib64 -lcudart -lcufft -lmwgpu PD_TV_GPU.cpp TV_PD_GPU_core.o +movefile('PD_TV_GPU.mex*',Pathmove); + fprintf('%s \n', 'Compiling TGV...'); !/usr/local/cuda/bin/nvcc -O0 -c TGV_GPU_core.cu -Xcompiler -fPIC -I~/SOFT/MATLAB9/extern/include/ mex -g -I/usr/local/cuda-10.0/include -L/usr/local/cuda-10.0/lib64 -lcudart -lcufft -lmwgpu TGV_GPU.cpp TGV_GPU_core.o @@ -72,6 +77,7 @@ mex -g -I/usr/local/cuda-10.0/include -L/usr/local/cuda-10.0/lib64 -lcudart -lcu movefile('PatchSelect_GPU.mex*',Pathmove); delete TV_ROF_GPU_core* TV_FGP_GPU_core* TV_SB_GPU_core* dTV_FGP_GPU_core* NonlDiff_GPU_core* Diffus_4thO_GPU_core* TGV_GPU_core* LLT_ROF_GPU_core* CCPiDefines.h +delete TV_PD_GPU_core* delete PatchSelect_GPU_core* Nonlocal_TV_core* shared.h fprintf('%s \n', 'All successfully compiled!'); diff --git a/src/Matlab/mex_compile/regularisers_CPU/PD_TV.c b/src/Matlab/mex_compile/regularisers_CPU/PD_TV.c new file mode 100644 index 0000000..e5ab1e4 --- /dev/null +++ b/src/Matlab/mex_compile/regularisers_CPU/PD_TV.c @@ -0,0 +1,100 @@ +/* + * This work is part of the Core Imaging Library developed by + * Visual Analytics and Imaging System Group of the Science Technology + * Facilities Council, STFC + * + * Copyright 2019 Daniil Kazantsev + * Copyright 2019 Srikanth Nagella, Edoardo Pasca + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * http://www.apache.org/licenses/LICENSE-2.0 + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#include "matrix.h" +#include "mex.h" +#include "PD_TV_core.h" + +/* C-OMP implementation of Primal-Dual TV [1] by Chambolle Pock denoising/regularization model (2D/3D case) + * + * Input Parameters: + * 1. Noisy image/volume + * 2. lambdaPar - regularization parameter + * 3. Number of iterations + * 4. eplsilon: tolerance constant + * 5. TV-type: methodTV - 'iso' (0) or 'l1' (1) + * 6. nonneg: 'nonnegativity (0 is OFF by default, 1 is ON) + * 7. lipschitz_const: convergence related parameter + * 8. tau: convergence related parameter + + * Output: + * [1] TV - Filtered/regularized image/volume + * [2] Information vector which contains [iteration no., reached tolerance] + * + * [1] Antonin Chambolle, Thomas Pock. "A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging", 2010 + */ +void mexFunction( + int nlhs, mxArray *plhs[], + int nrhs, const mxArray *prhs[]) + +{ + int number_of_dims, iter, methTV, nonneg; + mwSize dimX, dimY, dimZ; + const mwSize *dim_array; + float *Input, *infovec=NULL, *Output=NULL, lambda, epsil, lipschitz_const, tau; + + number_of_dims = mxGetNumberOfDimensions(prhs[0]); + dim_array = mxGetDimensions(prhs[0]); + + /*Handling Matlab input data*/ + if ((nrhs < 2) || (nrhs > 7)) mexErrMsgTxt("At least 2 parameters is required, all parameters are: Image(2D/3D), Regularization parameter, iterations number, tolerance, penalty type ('iso' or 'l1'), nonnegativity switch, lipschitz_const"); + + Input = (float *) mxGetData(prhs[0]); /*noisy image (2D/3D) */ + lambda = (float) mxGetScalar(prhs[1]); /* regularization parameter */ + iter = 500; /* default iterations number */ + epsil = 1.0e-06; /* default tolerance constant */ + methTV = 0; /* default isotropic TV penalty */ + nonneg = 0; /* default nonnegativity switch, off - 0 */ + lipschitz_const = 8.0; /* lipschitz_const */ + tau = 0.0025; /* tau convergence const */ + + if (mxGetClassID(prhs[0]) != mxSINGLE_CLASS) {mexErrMsgTxt("The input image must be in a single precision"); } + + if ((nrhs == 3) || (nrhs == 4) || (nrhs == 5) || (nrhs == 6) || (nrhs == 7) || (nrhs == 8)) iter = (int) mxGetScalar(prhs[2]); /* iterations number */ + if ((nrhs == 4) || (nrhs == 5) || (nrhs == 6) || (nrhs == 7) || (nrhs == 8)) epsil = (float) mxGetScalar(prhs[3]); /* tolerance constant */ + if ((nrhs == 5) || (nrhs == 6) || (nrhs == 7) || (nrhs == 8)) { + char *penalty_type; + penalty_type = mxArrayToString(prhs[4]); /* choosing TV penalty: 'iso' or 'l1', 'iso' is the default */ + if ((strcmp(penalty_type, "l1") != 0) && (strcmp(penalty_type, "iso") != 0)) mexErrMsgTxt("Choose TV type: 'iso' or 'l1',"); + if (strcmp(penalty_type, "l1") == 0) methTV = 1; /* enable 'l1' penalty */ + mxFree(penalty_type); + } + if ((nrhs == 6) || (nrhs == 7) || (nrhs == 8)) { + nonneg = (int) mxGetScalar(prhs[5]); + if ((nonneg != 0) && (nonneg != 1)) mexErrMsgTxt("Nonnegativity constraint can be enabled by choosing 1 or off - 0"); + } + if ((nrhs == 7) || (nrhs == 8)) lipschitz_const = (float) mxGetScalar(prhs[6]); + if (nrhs == 8) tau = (float) mxGetScalar(prhs[7]); + + /*Handling Matlab output data*/ + dimX = dim_array[0]; dimY = dim_array[1]; dimZ = dim_array[2]; + + if (number_of_dims == 2) { + dimZ = 1; /*2D case*/ + Output = (float*)mxGetPr(plhs[0] = mxCreateNumericArray(2, dim_array, mxSINGLE_CLASS, mxREAL)); + } + if (number_of_dims == 3) { + Output = (float*)mxGetPr(plhs[0] = mxCreateNumericArray(3, dim_array, mxSINGLE_CLASS, mxREAL)); + } + mwSize vecdim[1]; + vecdim[0] = 2; + infovec = (float*)mxGetPr(plhs[1] = mxCreateNumericArray(1, vecdim, mxSINGLE_CLASS, mxREAL)); + + /* running the function */ + PDTV_CPU_main(Input, Output, infovec, lambda, iter, epsil, lipschitz_const, methTV, nonneg, tau, dimX, dimY, dimZ); +} diff --git a/src/Matlab/mex_compile/regularisers_GPU/PD_TV_GPU.cpp b/src/Matlab/mex_compile/regularisers_GPU/PD_TV_GPU.cpp new file mode 100644 index 0000000..e853dd3 --- /dev/null +++ b/src/Matlab/mex_compile/regularisers_GPU/PD_TV_GPU.cpp @@ -0,0 +1,101 @@ +/* + * This work is part of the Core Imaging Library developed by + * Visual Analytics and Imaging System Group of the Science Technology + * Facilities Council, STFC + * + * Copyright 2019 Daniil Kazantsev + * Copyright 2019 Srikanth Nagella, Edoardo Pasca + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * http://www.apache.org/licenses/LICENSE-2.0 + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#include "matrix.h" +#include "mex.h" +#include "TV_PD_GPU_core.h" + +/* GPU implementation of Primal-Dual TV [1] by Chambolle Pock denoising/regularization model (2D/3D case) + * + * Input Parameters: + * 1. Noisy image/volume + * 2. lambdaPar - regularization parameter + * 3. Number of iterations + * 4. eplsilon: tolerance constant + * 5. TV-type: methodTV - 'iso' (0) or 'l1' (1) + * 6. nonneg: 'nonnegativity (0 is OFF by default, 1 is ON) + * 7. lipschitz_const: convergence related parameter + * 8. tau: convergence related parameter + + * Output: + * [1] TV - Filtered/regularized image/volume + * [2] Information vector which contains [iteration no., reached tolerance] + * + * [1] Antonin Chambolle, Thomas Pock. "A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging", 2010 + */ + +void mexFunction( + int nlhs, mxArray *plhs[], + int nrhs, const mxArray *prhs[]) + +{ + int number_of_dims, iter, methTV, nonneg; + mwSize dimX, dimY, dimZ; + const mwSize *dim_array; + float *Input, *infovec=NULL, *Output=NULL, lambda, epsil, lipschitz_const, tau; + + number_of_dims = mxGetNumberOfDimensions(prhs[0]); + dim_array = mxGetDimensions(prhs[0]); + + /*Handling Matlab input data*/ + if ((nrhs < 2) || (nrhs > 7)) mexErrMsgTxt("At least 2 parameters is required, all parameters are: Image(2D/3D), Regularization parameter, iterations number, tolerance, penalty type ('iso' or 'l1'), nonnegativity switch, lipschitz_const"); + + Input = (float *) mxGetData(prhs[0]); /*noisy image (2D/3D) */ + lambda = (float) mxGetScalar(prhs[1]); /* regularization parameter */ + iter = 500; /* default iterations number */ + epsil = 1.0e-06; /* default tolerance constant */ + methTV = 0; /* default isotropic TV penalty */ + nonneg = 0; /* default nonnegativity switch, off - 0 */ + lipschitz_const = 8.0; /* lipschitz_const */ + tau = 0.0025; /* tau convergence const */ + + if (mxGetClassID(prhs[0]) != mxSINGLE_CLASS) {mexErrMsgTxt("The input image must be in a single precision"); } + + if ((nrhs == 3) || (nrhs == 4) || (nrhs == 5) || (nrhs == 6) || (nrhs == 7) || (nrhs == 8)) iter = (int) mxGetScalar(prhs[2]); /* iterations number */ + if ((nrhs == 4) || (nrhs == 5) || (nrhs == 6) || (nrhs == 7) || (nrhs == 8)) epsil = (float) mxGetScalar(prhs[3]); /* tolerance constant */ + if ((nrhs == 5) || (nrhs == 6) || (nrhs == 7) || (nrhs == 8)) { + char *penalty_type; + penalty_type = mxArrayToString(prhs[4]); /* choosing TV penalty: 'iso' or 'l1', 'iso' is the default */ + if ((strcmp(penalty_type, "l1") != 0) && (strcmp(penalty_type, "iso") != 0)) mexErrMsgTxt("Choose TV type: 'iso' or 'l1',"); + if (strcmp(penalty_type, "l1") == 0) methTV = 1; /* enable 'l1' penalty */ + mxFree(penalty_type); + } + if ((nrhs == 6) || (nrhs == 7) || (nrhs == 8)) { + nonneg = (int) mxGetScalar(prhs[5]); + if ((nonneg != 0) && (nonneg != 1)) mexErrMsgTxt("Nonnegativity constraint can be enabled by choosing 1 or off - 0"); + } + if ((nrhs == 7) || (nrhs == 8)) lipschitz_const = (float) mxGetScalar(prhs[6]); + if (nrhs == 8) tau = (float) mxGetScalar(prhs[7]); + + /*Handling Matlab output data*/ + dimX = dim_array[0]; dimY = dim_array[1]; dimZ = dim_array[2]; + + if (number_of_dims == 2) { + dimZ = 1; /*2D case*/ + Output = (float*)mxGetPr(plhs[0] = mxCreateNumericArray(2, dim_array, mxSINGLE_CLASS, mxREAL)); + } + if (number_of_dims == 3) { + Output = (float*)mxGetPr(plhs[0] = mxCreateNumericArray(3, dim_array, mxSINGLE_CLASS, mxREAL)); + } + mwSize vecdim[1]; + vecdim[0] = 2; + infovec = (float*)mxGetPr(plhs[1] = mxCreateNumericArray(1, vecdim, mxSINGLE_CLASS, mxREAL)); + + /* running the function */ + TV_PD_GPU_main(Input, Output, infovec, lambda, iter, epsil, lipschitz_const, methTV, nonneg, tau, dimX, dimY, dimZ); +} diff --git a/src/Python/ccpi/filters/regularisers.py b/src/Python/ccpi/filters/regularisers.py index 0b5b2ee..5f4001a 100644 --- a/src/Python/ccpi/filters/regularisers.py +++ b/src/Python/ccpi/filters/regularisers.py @@ -2,9 +2,9 @@ script which assigns a proper device core function based on a flag ('cpu' or 'gpu') """ -from ccpi.filters.cpu_regularisers import TV_ROF_CPU, TV_FGP_CPU, TV_SB_CPU, dTV_FGP_CPU, TNV_CPU, NDF_CPU, Diff4th_CPU, TGV_CPU, LLT_ROF_CPU, PATCHSEL_CPU, NLTV_CPU +from ccpi.filters.cpu_regularisers import TV_ROF_CPU, TV_FGP_CPU, TV_PD_CPU, TV_SB_CPU, dTV_FGP_CPU, TNV_CPU, NDF_CPU, Diff4th_CPU, TGV_CPU, LLT_ROF_CPU, PATCHSEL_CPU, NLTV_CPU try: - from ccpi.filters.gpu_regularisers import TV_ROF_GPU, TV_FGP_GPU, TV_SB_GPU, dTV_FGP_GPU, NDF_GPU, Diff4th_GPU, TGV_GPU, LLT_ROF_GPU, PATCHSEL_GPU + from ccpi.filters.gpu_regularisers import TV_ROF_GPU, TV_FGP_GPU, TV_PD_GPU, TV_SB_GPU, dTV_FGP_GPU, NDF_GPU, Diff4th_GPU, TGV_GPU, LLT_ROF_GPU, PATCHSEL_GPU gpu_enabled = True except ImportError: gpu_enabled = False @@ -51,6 +51,33 @@ def FGP_TV(inputData, regularisation_parameter,iterations, raise ValueError ('GPU is not available') raise ValueError('Unknown device {0}. Expecting gpu or cpu'\ .format(device)) + +def PD_TV(inputData, regularisation_parameter, iterations, + tolerance_param, methodTV, nonneg, lipschitz_const, tau, device='cpu'): + if device == 'cpu': + return TV_PD_CPU(inputData, + regularisation_parameter, + iterations, + tolerance_param, + methodTV, + nonneg, + lipschitz_const, + tau) + elif device == 'gpu' and gpu_enabled: + return TV_PD_GPU(inputData, + regularisation_parameter, + iterations, + tolerance_param, + methodTV, + nonneg, + lipschitz_const, + tau) + else: + if not gpu_enabled and device == 'gpu': + raise ValueError ('GPU is not available') + raise ValueError('Unknown device {0}. Expecting gpu or cpu'\ + .format(device)) + def SB_TV(inputData, regularisation_parameter, iterations, tolerance_param, methodTV, device='cpu'): if device == 'cpu': @@ -212,4 +239,3 @@ def NDF_INP(inputData, maskData, regularisation_parameter, edge_parameter, itera def NVM_INP(inputData, maskData, SW_increment, iterations): return NVM_INPAINT_CPU(inputData, maskData, SW_increment, iterations) - diff --git a/src/Python/setup-regularisers.py.in b/src/Python/setup-regularisers.py.in index 4c578e3..c4ad143 100644 --- a/src/Python/setup-regularisers.py.in +++ b/src/Python/setup-regularisers.py.in @@ -8,13 +8,13 @@ from Cython.Distutils import build_ext import os import sys import numpy -import platform +import platform cil_version=os.environ['CIL_VERSION'] if cil_version == '': print("Please set the environmental variable CIL_VERSION") sys.exit(1) - + library_include_path = "" library_lib_path = "" try: @@ -23,7 +23,7 @@ try: except: library_include_path = os.environ['PREFIX']+'/include' pass - + extra_include_dirs = [numpy.get_include(), library_include_path] #extra_library_dirs = [os.path.join(library_include_path, "..", "lib")] extra_compile_args = [] @@ -39,6 +39,7 @@ extra_include_dirs += [os.path.join(".." , "Core"), os.path.join(".." , "Core", "inpainters_CPU"), os.path.join(".." , "Core", "regularisers_GPU" , "TV_FGP" ) , os.path.join(".." , "Core", "regularisers_GPU" , "TV_ROF" ) , + os.path.join(".." , "Core", "regularisers_GPU" , "TV_PD" ) , os.path.join(".." , "Core", "regularisers_GPU" , "TV_SB" ) , os.path.join(".." , "Core", "regularisers_GPU" , "TGV" ) , os.path.join(".." , "Core", "regularisers_GPU" , "LLTROF" ) , @@ -48,12 +49,12 @@ extra_include_dirs += [os.path.join(".." , "Core"), os.path.join(".." , "Core", "regularisers_GPU" , "PatchSelect" ) , "."] -if platform.system() == 'Windows': - extra_compile_args[0:] = ['/DWIN32','/EHsc','/DBOOST_ALL_NO_LIB' , '/openmp' ] +if platform.system() == 'Windows': + extra_compile_args[0:] = ['/DWIN32','/EHsc','/DBOOST_ALL_NO_LIB' , '/openmp' ] else: extra_compile_args = ['-fopenmp','-O2', '-funsigned-char', '-Wall', '-std=c++0x'] extra_libraries += [@EXTRA_OMP_LIB@] - + setup( name='ccpi', description='CCPi Core Imaging Library - Image regularisers', @@ -61,13 +62,13 @@ setup( cmdclass = {'build_ext': build_ext}, ext_modules = [Extension("ccpi.filters.cpu_regularisers", sources=[os.path.join("." , "src", "cpu_regularisers.pyx" ) ], - include_dirs=extra_include_dirs, - library_dirs=extra_library_dirs, - extra_compile_args=extra_compile_args, - libraries=extra_libraries ), - + include_dirs=extra_include_dirs, + library_dirs=extra_library_dirs, + extra_compile_args=extra_compile_args, + libraries=extra_libraries ), + ], - zip_safe = False, + zip_safe = False, packages = {'ccpi', 'ccpi.filters', 'ccpi.supp'}, ) diff --git a/src/Python/src/cpu_regularisers.pyx b/src/Python/src/cpu_regularisers.pyx index 4917d06..8de6aea 100644 --- a/src/Python/src/cpu_regularisers.pyx +++ b/src/Python/src/cpu_regularisers.pyx @@ -20,6 +20,7 @@ cimport numpy as np cdef extern float TV_ROF_CPU_main(float *Input, float *Output, float *infovector, float *lambdaPar, int lambda_is_arr, int iterationsNumb, float tau, float epsil, int dimX, int dimY, int dimZ); cdef extern float TV_FGP_CPU_main(float *Input, float *Output, float *infovector, float lambdaPar, int iterationsNumb, float epsil, int methodTV, int nonneg, int dimX, int dimY, int dimZ); +cdef extern float PDTV_CPU_main(float *Input, float *U, float *infovector, float lambdaPar, int iterationsNumb, float epsil, float lipschitz_const, int methodTV, int nonneg, float tau, int dimX, int dimY, int dimZ); cdef extern float SB_TV_CPU_main(float *Input, float *Output, float *infovector, float mu, int iter, float epsil, int methodTV, int dimX, int dimY, int dimZ); cdef extern float LLT_ROF_CPU_main(float *Input, float *Output, float *infovector, float lambdaROF, float lambdaLLT, int iterationsNumb, float tau, float epsil, int dimX, int dimY, int dimZ); cdef extern float TGV_main(float *Input, float *Output, float *infovector, float lambdaPar, float alpha1, float alpha0, int iterationsNumb, float L2, float epsil, int dimX, int dimY, int dimZ); @@ -58,9 +59,6 @@ def TV_ROF_2D(np.ndarray[np.float32_t, ndim=2, mode="c"] inputData, cdef np.ndarray[np.float32_t, ndim=1, mode="c"] infovec = \ np.ones([2], dtype='float32') - # Run ROF iterations for 2D data - # TV_ROF_CPU_main(&inputData[0,0], &outputData[0,0], &infovec[0], regularisation_parameter, iterationsNumb, marching_step_parameter, tolerance_param, dims[1], dims[0], 1) - # Run ROF iterations for 2D data if isinstance (regularisation_parameter, np.ndarray): reg = regularisation_parameter.copy() TV_ROF_CPU_main(&inputData[0,0], &outputData[0,0], &infovec[0], ®[0,0], 1, iterationsNumb, marching_step_parameter, tolerance_param, dims[1], dims[0], 1) @@ -158,6 +156,75 @@ def TV_FGP_3D(np.ndarray[np.float32_t, ndim=3, mode="c"] inputData, dims[2], dims[1], dims[0]) return (outputData,infovec) +#****************************************************************# +#****************** Total-variation Primal-dual *****************# +#****************************************************************# +def TV_PD_CPU(inputData, regularisation_parameter, iterationsNumb, tolerance_param, methodTV, nonneg, lipschitz_const, tau): + if inputData.ndim == 2: + return TV_PD_2D(inputData, regularisation_parameter, iterationsNumb, tolerance_param, methodTV, nonneg, lipschitz_const, tau) + elif inputData.ndim == 3: + return TV_PD_3D(inputData, regularisation_parameter, iterationsNumb, tolerance_param, methodTV, nonneg, lipschitz_const, tau) + +def TV_PD_2D(np.ndarray[np.float32_t, ndim=2, mode="c"] inputData, + float regularisation_parameter, + int iterationsNumb, + float tolerance_param, + int methodTV, + int nonneg, + float lipschitz_const, + float tau): + + cdef long dims[2] + dims[0] = inputData.shape[0] + dims[1] = inputData.shape[1] + + cdef np.ndarray[np.float32_t, ndim=2, mode="c"] outputData = \ + np.zeros([dims[0],dims[1]], dtype='float32') + + cdef np.ndarray[np.float32_t, ndim=1, mode="c"] infovec = \ + np.ones([2], dtype='float32') + + #/* Run FGP-TV iterations for 2D data */ + PDTV_CPU_main(&inputData[0,0], &outputData[0,0], &infovec[0], regularisation_parameter, + iterationsNumb, + tolerance_param, + lipschitz_const, + methodTV, + nonneg, + tau, + dims[1],dims[0], 1) + return (outputData,infovec) + +def TV_PD_3D(np.ndarray[np.float32_t, ndim=3, mode="c"] inputData, + float regularisation_parameter, + int iterationsNumb, + float tolerance_param, + int methodTV, + int nonneg, + float lipschitz_const, + float tau): + + cdef long dims[3] + dims[0] = inputData.shape[0] + dims[1] = inputData.shape[1] + dims[2] = inputData.shape[2] + + cdef np.ndarray[np.float32_t, ndim=3, mode="c"] outputData = \ + np.zeros([dims[0], dims[1], dims[2]], dtype='float32') + cdef np.ndarray[np.float32_t, ndim=1, mode="c"] infovec = \ + np.zeros([2], dtype='float32') + + #/* Run FGP-TV iterations for 3D data */ + PDTV_CPU_main(&inputData[0,0,0], &outputData[0,0,0], &infovec[0], regularisation_parameter, + iterationsNumb, + tolerance_param, + lipschitz_const, + methodTV, + nonneg, + tau, + dims[2], dims[1], dims[0]) + return (outputData,infovec) + #***************************************************************# #********************** Total-variation SB *********************# #***************************************************************# diff --git a/src/Python/src/gpu_regularisers.pyx b/src/Python/src/gpu_regularisers.pyx index 8cd8c93..b22d15e 100644 --- a/src/Python/src/gpu_regularisers.pyx +++ b/src/Python/src/gpu_regularisers.pyx @@ -22,6 +22,7 @@ CUDAErrorMessage = 'CUDA error' cdef extern int TV_ROF_GPU_main(float* Input, float* Output, float *infovector, float *lambdaPar, int lambda_is_arr, int iter, float tau, float epsil, int N, int M, int Z); cdef extern int TV_FGP_GPU_main(float *Input, float *Output, float *infovector, float lambdaPar, int iter, float epsil, int methodTV, int nonneg, int N, int M, int Z); +cdef extern int TV_PD_GPU_main(float *Input, float *Output, float *infovector, float lambdaPar, int iter, float epsil, float lipschitz_const, int methodTV, int nonneg, float tau, int dimX, int dimY, int dimZ); cdef extern int TV_SB_GPU_main(float *Input, float *Output, float *infovector, float lambdaPar, int iter, float epsil, int methodTV, int N, int M, int Z); cdef extern int LLT_ROF_GPU_main(float *Input, float *Output, float *infovector, float lambdaROF, float lambdaLLT, int iterationsNumb, float tau, float epsil, int N, int M, int Z); cdef extern int TGV_GPU_main(float *Input, float *Output, float *infovector, float lambdaPar, float alpha1, float alpha0, int iterationsNumb, float L2, float epsil, int dimX, int dimY, int dimZ); @@ -70,6 +71,75 @@ def TV_FGP_GPU(inputData, tolerance_param, methodTV, nonneg) +# Total-variation Primal-Dual (PD) +def TV_PD_GPU(inputData, regularisation_parameter, iterationsNumb, tolerance_param, methodTV, nonneg, lipschitz_const, tau): + if inputData.ndim == 2: + return TVPD2D(inputData, regularisation_parameter, iterationsNumb, tolerance_param, methodTV, nonneg, lipschitz_const, tau) + elif inputData.ndim == 3: + return TVPD3D(inputData, regularisation_parameter, iterationsNumb, tolerance_param, methodTV, nonneg, lipschitz_const, tau) + +def TVPD2D(np.ndarray[np.float32_t, ndim=2, mode="c"] inputData, + float regularisation_parameter, + int iterationsNumb, + float tolerance_param, + int methodTV, + int nonneg, + float lipschitz_const, + float tau): + + cdef long dims[2] + dims[0] = inputData.shape[0] + dims[1] = inputData.shape[1] + + cdef np.ndarray[np.float32_t, ndim=2, mode="c"] outputData = \ + np.zeros([dims[0],dims[1]], dtype='float32') + + cdef np.ndarray[np.float32_t, ndim=1, mode="c"] infovec = \ + np.ones([2], dtype='float32') + + if (TV_PD_GPU_main(&inputData[0,0], &outputData[0,0], &infovec[0], regularisation_parameter, + iterationsNumb, + tolerance_param, + lipschitz_const, + methodTV, + nonneg, + tau, + dims[1],dims[0], 1) ==0): + return (outputData,infovec) + else: + raise ValueError(CUDAErrorMessage); + +def TVPD3D(np.ndarray[np.float32_t, ndim=3, mode="c"] inputData, + float regularisation_parameter, + int iterationsNumb, + float tolerance_param, + int methodTV, + int nonneg, + float lipschitz_const, + float tau): + + cdef long dims[3] + dims[0] = inputData.shape[0] + dims[1] = inputData.shape[1] + dims[2] = inputData.shape[2] + + cdef np.ndarray[np.float32_t, ndim=3, mode="c"] outputData = \ + np.zeros([dims[0], dims[1], dims[2]], dtype='float32') + cdef np.ndarray[np.float32_t, ndim=1, mode="c"] infovec = \ + np.zeros([2], dtype='float32') + + if (TV_PD_GPU_main(&inputData[0,0,0], &outputData[0,0,0], &infovec[0], regularisation_parameter, + iterationsNumb, + tolerance_param, + lipschitz_const, + methodTV, + nonneg, + tau, + dims[2], dims[1], dims[0]) ==0): + return (outputData,infovec) + else: + raise ValueError(CUDAErrorMessage); + # Total-variation Split Bregman (SB) def TV_SB_GPU(inputData, regularisation_parameter, @@ -195,7 +265,7 @@ def ROFTV2D(np.ndarray[np.float32_t, ndim=2, mode="c"] inputData, if isinstance (regularisation_parameter, np.ndarray): reg = regularisation_parameter.copy() # Running CUDA code here - if (TV_ROF_GPU_main(&inputData[0,0], &outputData[0,0], &infovec[0], + if (TV_ROF_GPU_main(&inputData[0,0], &outputData[0,0], &infovec[0], ®[0,0], 1, iterations, time_marching_parameter, diff --git a/test/test_CPU_regularisers.py b/test/test_CPU_regularisers.py index 95e2a3f..266ca8a 100644 --- a/test/test_CPU_regularisers.py +++ b/test/test_CPU_regularisers.py @@ -3,7 +3,7 @@ import unittest import os #import timeit import numpy as np -from ccpi.filters.regularisers import FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th, ROF_TV +from ccpi.filters.regularisers import FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th, ROF_TV, PD_TV from testroutines import BinReader, rmse ############################################################################### @@ -39,6 +39,15 @@ class TestRegularisers(unittest.TestCase): self.assertAlmostEqual(rms,0.02,delta=0.01) + def test_PD_TV_CPU(self): + Im,input,ref = self.getPars() + + pd_cpu,info = PD_TV(input, 0.02, 300, 0.0, 0, 0, 8, 0.0025, 'cpu'); + + rms = rmse(Im, pd_cpu) + + self.assertAlmostEqual(rms,0.02,delta=0.01) + def test_TV_ROF_CPU(self): # set parameters Im, input,ref = self.getPars() diff --git a/test/test_run_test.py b/test/test_run_test.py index f27e7fc..1707aec 100755 --- a/test/test_run_test.py +++ b/test/test_run_test.py @@ -2,7 +2,7 @@ import unittest import numpy as np
import os
import timeit
-from ccpi.filters.regularisers import ROF_TV, FGP_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th
+from ccpi.filters.regularisers import ROF_TV, FGP_TV, PD_TV, SB_TV, TGV, LLT_ROF, FGP_dTV, NDF, Diff4th
#from PIL import Image
from testroutines import BinReader, rmse, printParametersToString
@@ -164,6 +164,94 @@ class TestRegularisers(unittest.TestCase): self.assertLessEqual(diff_im.sum() , 1)
+ def test_PD_TV_CPU_vs_GPU(self):
+ print(__name__)
+ #filename = os.path.join("test","lena_gray_512.tif")
+ #plt = TiffReader()
+ filename = os.path.join("test","test_imageLena.bin")
+ plt = BinReader()
+ # read image
+ Im = plt.imread(filename)
+ Im = np.asarray(Im, dtype='float32')
+
+ Im = Im/255
+ perc = 0.05
+ u0 = Im + np.random.normal(loc = 0 ,
+ scale = perc * Im ,
+ size = np.shape(Im))
+ u_ref = Im + np.random.normal(loc = 0 ,
+ scale = 0.01 * Im ,
+ size = np.shape(Im))
+
+ # map the u0 u0->u0>0
+ # f = np.frompyfunc(lambda x: 0 if x < 0 else x, 1,1)
+ u0 = u0.astype('float32')
+ u_ref = u_ref.astype('float32')
+
+ print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
+ print ("____________PD-TV bench___________________")
+ print ("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
+
+
+ pars = {'algorithm' : PD_TV, \
+ 'input' : u0,\
+ 'regularisation_parameter':0.02, \
+ 'number_of_iterations' :1500 ,\
+ 'tolerance_constant':0.0,\
+ 'methodTV': 0 ,\
+ 'nonneg': 0,
+ 'lipschitz_const' : 8,
+ 'tau' : 0.0025}
+
+ print ("#############PD TV CPU####################")
+ start_time = timeit.default_timer()
+ (pd_cpu,info_vec_cpu) = PD_TV(pars['input'],
+ pars['regularisation_parameter'],
+ pars['number_of_iterations'],
+ pars['tolerance_constant'],
+ pars['methodTV'],
+ pars['nonneg'],
+ pars['lipschitz_const'],
+ pars['tau'],'cpu')
+
+ rms = rmse(Im, pd_cpu)
+ pars['rmse'] = rms
+
+ txtstr = printParametersToString(pars)
+ txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time)
+ print (txtstr)
+
+ print ("##############PD TV GPU##################")
+ start_time = timeit.default_timer()
+ try:
+ (pd_gpu,info_vec_gpu) = PD_TV(pars['input'],
+ pars['regularisation_parameter'],
+ pars['number_of_iterations'],
+ pars['tolerance_constant'],
+ pars['methodTV'],
+ pars['nonneg'],
+ pars['lipschitz_const'],
+ pars['tau'],'gpu')
+
+ except ValueError as ve:
+ self.skipTest("Results not comparable. GPU computing error.")
+
+ rms = rmse(Im, pd_gpu)
+ pars['rmse'] = rms
+ pars['algorithm'] = PD_TV
+ txtstr = printParametersToString(pars)
+ txtstr += "%s = %.3fs" % ('elapsed time',timeit.default_timer() - start_time)
+ print (txtstr)
+
+ print ("--------Compare the results--------")
+ tolerance = 1e-05
+ diff_im = np.zeros(np.shape(pd_cpu))
+ diff_im = abs(pd_cpu - pd_gpu)
+ diff_im[diff_im > tolerance] = 1
+
+ self.assertLessEqual(diff_im.sum() , 1)
+
+
def test_SB_TV_CPU_vs_GPU(self):
print(__name__)
#filename = os.path.join("test","lena_gray_512.tif")
|