Commit Graph

40 Commits

Author SHA1 Message Date
Eric van Gyzen
ac818ca644 rtld: pacify -Wmaybe-uninitialized from gcc6
Sponsored by:	Dell EMC Isilon
2019-02-01 23:16:59 +00:00
Michal Meloun
4849c3a570 Improve R_AARCH64_TLSDESC relocation.
The original code did not support dynamically loaded libraries and used
suboptimal access to TLS variables.
New implementation removes lazy resolving of TLS relocation - due to flaw
in TLSDESC design is impossible to switch resolver function at runtime
without expensive locking.

Due to this, 3 specialized resolvers are implemented:
 - load time resolver for TLS relocation from libraries loaded with main
   executable (thus with known TLS offset).
 - resolver for undefined thread weak symbols.
 - slower lazy resolver for dynamically loaded libraries with fast path for
   already resolved symbols.

PR:		228892, 232149, 233204, 232311
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18417
2018-12-15 10:38:07 +00:00
Alex Richardson
903e0ffd07 rtld-elf: compile with WANRS=4 warnings other than -Wcast-align
Reviewed By:	kib
Approved By:	brooks (mentor)
Differential Revision: https://reviews.freebsd.org/D17153
2018-10-29 21:08:19 +00:00
Marius Strobl
41fc6f680b o Let rtld(1) set up psABI user trap handlers prior to executing the
objects' init functions instead of doing the setup via a constructor
  in libc as the init functions may already depend on these handlers
  to be in place. This gets us rid of:
  - the undefined order in which libc constructors as __guard_setup()
    and jemalloc_constructor() are executed WRT __sparc_utrap_setup(),
  - the requirement to link libc last so __sparc_utrap_setup() gets
    called prior to constructors in other libraries (see r122883).
  For static binaries, crt1.o still sets up the user trap handlers.
o Move misplaced prototypes for MD functions in to the MD prototype
  section of rtld.h.
o Sprinkle nitems().
2018-02-03 23:14:11 +00:00
Pedro F. Giffuni
e6209940de libexec: adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:25:02 +00:00
Konstantin Belousov
e35ddbe448 Implement LD_BIND_NOT knob for rtld.
From the manpage:
When set to a nonempty string, prevents modifications of the PLT slots
when doing bindings.  As result, each call of the PLT-resolved
function is resolved.  In combination with debug output, this provides
complete account of all bind actions at runtime.

Same feature exists on Linux and Solaris.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-15 21:11:57 +00:00
Konstantin Belousov
d27078f990 Adjust r308689 to make rtld compilable with either in-tree or
(hopefully) stock gcc 4.2.1 on i386 and other arches.

In particular:
- Do not use %ebx in the asm constraints on i386, since rtld is
  compiled with -fPIC and gcc cannot handle GOT-base register reload
  (clang and newer gcc can).
- Avoid direct use of [static N] construct in the function
  declaration/definion.  In-tree gcc was patched to support this, but
  stock 4.2.1 cannot handle the feature.

Requested by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-11-21 14:13:57 +00:00
Konstantin Belousov
4352999e0e Pass CPUID[1] %edx (cpu_feature), %ecx (cpu_feature2) and
CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the
ifunc resolvers on x86.

It is much more clean to use CPUID instruction in usermode to retrieve
this information than to pass AT_HWCAP aux vector from kernel, on
x86.  Still, the change does allow for use of AT_HWCAP on arches where it is
needed, by passing aux array to ifunc_init() initializer which should
prepare arguments for ifunc resolvers.

Current signature for resolvers on x86 is
	func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2,
	    uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2);
where arguments have identical meaning as the kernel variables of the
same name.  The ABIs allow to use resolvers with the void or shortened
list of arguments.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8448
2016-11-15 09:43:26 +00:00
Konstantin Belousov
9fee0541f2 Do not call callbacks for dl_iterate_phdr(3) with the rtld bind and
phdr locks locked.  This allows to call rtld services from the
callback, which is only reasonable for dlopen(path, RTLD_NOLOAD) to
test existence of the library in the image, and for dlsym().  The
later might still be not quite safe, due to the lazy resolution of
filters.

To allow dropping the locks around iteration in dl_iterate_phdr(3), we
insert markers to track current position between relocks.  The global
objects list is converted to tailq and all iterators skip markers,
globallist_next() and globallist_curr() helpers are added.

Reported and tested by:	davide
Reviewed by:	kan
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2016-01-20 07:21:33 +00:00
Konstantin Belousov
0c4f9ecde3 Change compiler setting to make default visibility of the symbols for
rtld on x86 to be hidden.  This is a micro-optimization, which allows
intrinsic references inside rtld to be handled without indirection
through PLT.  The visibility of rtld symbols for other objects in the
symbol namespace is controlled by a version script.

Reviewed by:	kan, jilles
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-03-29 18:53:21 +00:00
Konstantin Belousov
74b0daf4f9 Optimize r270798, only do the second pass over non-plt relocations
when the first pass found IFUNCs.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-08-29 10:43:56 +00:00
Konstantin Belousov
14c3564759 IFUNC symbol type shall be processed for non-PLT relocations,
e.g. when a global variable is initialized with a pointer to ifunc.
Add symbol type check and call resolver for STT_GNU_IFUNC symbol types
when processing non-PLT relocations, but only after non-IFUNC
relocations are done.  The two-phase proceessing is required since
resolvers may reference other symbols, which must be ready to use when
resolver calls are done.

Restructure reloc_non_plt() on x86 to call find_symdef() and handle
IFUNC in single place.

For non-x86 reloc_non_plt(), check for call for IFUNC relocation and
do nothing, to avoid processing relocs twice.

PR:	193048
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-08-29 09:29:10 +00:00
Konstantin Belousov
f62651920d Add GNU hash support for rtld.
Based on dragonflybsd support for GNU hash by John Marino <draco marino st>
Reviewed by:	kan
Tested by:	bapt
MFC after:	2 weeks
2012-04-30 13:31:10 +00:00
Konstantin Belousov
082f959ac8 Fix several problems with our ELF filters implementation.
Do not relocate twice an object which happens to be needed by loaded
binary (or dso) and some filtee opened due to symbol resolution when
relocating need objects.  Record the state of the relocation
processing in Obj_Entry and short-circuit relocate_objects() if
current object already processed.

Do not call constructors for filtees loaded during the early
relocation processing before image is initialized enough to run
user-provided code.  Filtees are loaded using dlopen_object(), which
normally performs relocation and initialization.  If filtee is
lazy-loaded during the relocation of dso needed by the main object,
dlopen_object() runs too earlier, when most runtime services are not
yet ready.

Postpone the constructors call to the time when main binary and
depended libraries constructors are run, passing the new flag
RTLD_LO_EARLY to dlopen_object().  Symbol lookups callers inform
symlook_* functions about early stage of initialization with
SYMLOOK_EARLY.  Pass flags through all functions participating in
object relocation.

Use the opportunity and fix flags argument to find_symdef() in
arch-specific reloc.c to use proper name SYMLOOK_IN_PLT instead of
true, which happen to have the same numeric value.

Reported and tested by:	theraven
Reviewed by:	kan
MFC after:	2 weeks
2012-03-20 13:20:49 +00:00
Ed Schouten
581f58e7a3 Remove unneeded dtv variable.
It is only assigned and not used at all. The object files stay identical
when the variables are removed.

Approved by:	kib
2012-01-17 21:55:20 +00:00
Konstantin Belousov
5734c46c68 _rtld_bind() read-locks the bind lock, and possible plt resolution
from the dispatcher would also acquire bind lock in read mode, which
is the supported operation. plt is explicitely designed to allow safe
multithreaded updates, so the shared lock do not cause problems.

The error in r228435 is that it allows read lock acquisition after the
write lock for the bind block.  If we dlopened the shared object that
contains IRELATIVE or jump slot which target is STT_GNU_IFUNC, then
possible recursive plt resolve from the dispatcher would cause it.

Postpone the resolution for irelative/ifunc right before initializers
are called, and drop bind lock around calls to dispatcher.  Use
initlist to iterate over the objects instead of the ->next, due to
drop of the bind lock in iteration.

For i386/reloc.c:reloc_iresolve(), fix calculation of the dispatch
function address for dso, by taking into account possible non-zero
relocbase.

MFC after:	3 weeks
2011-12-14 16:47:53 +00:00
Konstantin Belousov
6be4b69715 Add support for STT_GNU_IFUNC and R_MACHINE_IRELATIVE GNU extensions to
rtld on 386 and amd64. This adds runtime bits neccessary for the use
of the dispatch functions from the dynamically-linked executables and
shared libraries.

To allow use of external references from the dispatch function, resolution
of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries
for PLT are prepared, and normal resolution of the GOT entries is finished.
Similar to how it is done by GNU, IRELATIVE relocations are resolved in
advance, instead of normal lazy handling for PLT.

Move the init_pltgot() call before the relocations for the object are
processed.

MFC after:	3 weeks
2011-12-12 11:03:14 +00:00
Konstantin Belousov
ef9cbd91d0 Handle the R_386_TLS_TPOFF32 relocation, which is similar to R_386_TLS_TPOFF,
but with negative relocation value.

Found by:	mpfr test suite, pointed to by ale
Reviewed by:	kan
MFC after:	1 week
2011-10-08 12:42:19 +00:00
Konstantin Belousov
8569deaf1c Implement support for ELF filters in rtld. Both normal and auxillary
filters are implemented.

Filtees are loaded on demand, unless LD_LOADFLTR environment variable
is set or -z loadfltr was specified during the linking. This forces
rtld to upgrade read-locked rtld_bind_lock to write lock when it
encounters an object with filter during symbol lookup.

Consolidate common arguments of the symbol lookup functions in the
SymLook structure.  Track the state of the rtld locks in the
RtldLockState structure. Pass local RtldLockState through the rtld
symbol lookup calls to allow lock upgrades.

Reviewed by:	kan
Tested by:	Mykola Dzham <i levsha me>, nwhitehorn (powerpc)
2010-12-25 08:51:20 +00:00
Roman Divacky
1dfdc15bb0 Only use the cache after the early stage of loading. This is
because calling mmap() etc. may use GOT which is not set up
yet. Use calloc() instead of mmap() in cases where this
was the case before (sparc64, powerpc, arm).

Submitted by:	Dimitry Andric (dimitry andric com)
Reviewed by:	kan
Approved by:	ed (mentor)
2010-05-18 08:55:23 +00:00
David Xu
c0d2338cdd Allocate space for thread pointer, this allows thread library to access
its pointer from begin, and simplifies _get_curthread() in libthr.
2006-03-28 06:09:24 +00:00
Alexander Kabaev
0eb88f2029 Implement ELF symbol versioning using GNU semantics. This code aims
to be compatible with symbol versioning support as implemented by
GNU libc and documented by http://people.redhat.com/~drepper/symbol-versioning
and LSB 3.0.

Implement dlvsym() function to allow lookups for a specific version of
a given symbol.
2005-12-18 19:43:33 +00:00
Peter Wemm
3b4399f6a7 Clean out the leftovers from the i386_set_gsbase() TLS conversion.
Like on libthr, there is an i386_set_gsbase() stub implementation here
to avoid libc.so.5 issues.  This should likely be a weak symbol and I
expect this will be fixed soon.

Approved by:	re
2005-06-29 23:15:36 +00:00
David Xu
9b0c632a4c Fix compilation problem. 2005-04-27 13:17:23 +00:00
Peter Wemm
8d598c0d01 Stop calling _amd64_set_gsbase() for COMPAT_32BIT. The amd64 kernel
implements i386_set_gsbase(), so there is no need for the variation.
2005-04-26 20:38:44 +00:00
Peter Wemm
8a477e0a7a Attempt to use i386_set_gsbase(), and gracefully fall back to LDT methods
if the direct access methods are not implemented.
2005-04-14 00:04:50 +00:00
Peter Wemm
24b4ec3d21 The 32 bit compatability ld-elf32.so.1 cannot use i386_set_ldt() when
running on an amd64 kernel.  Use the recently exposed direct %fs/%gs set
routines instead for the TLS setup of 32 bit binaries.
2004-11-06 03:32:07 +00:00
Doug Rabson
017246d02f Add support for Thread Local Storage. 2004-08-03 08:51:00 +00:00
Alexander Kabaev
605f36fc1e No need to zero fill memory, mmapped anonymously. Kernel will
return pre-zeroed pages itself.

Noticed by:     jake
2003-03-14 21:10:13 +00:00
Thomas Moestl
a42a42e9b9 Fix the handling of high PLT entries (> 32764) on sparc64. This requires
additional arguments to reloc_jmpslot(), which is why MI code and MD code
of other platforms had to be changed.

Reviewed by:	jake
Approved by:	re
2002-11-18 22:08:50 +00:00
Matthew Dillon
b08440e568 Correct a bug in the last commit. The whole point of creating a 'done:'
goto target was so the cache could be freed.  So free the cache after
done: rather then before done: (!)

Submitted by:	Gavin Atkinson <gavin@ury.york.ac.uk>
2002-06-10 21:15:50 +00:00
Matthew Dillon
b603db3019 In tracking down an installation seg fault with then openoffice port
Martin Blapp determined that the elf dynamic loader was at fault.  In
particular, the loader uses alloca() to allocate a symbol cache on the
stack.  Normally this would work just fine, but if the loader is called
from a threaded program and the object being loaded is fairly large the
alloca() can blow away the thread stack and effect other nearby thread
stacks as well.  My testing showed that the symbol cache can be as large
as 250KBytes during the openoffice port build and install sequence.  Martin
was able to work around the problem by disabling the symbol cache
(cache = NULL;).  However, this solution is not adequate for commit because
it can cause an enormous cpu burden for applications which do a lot of
dynamic loading (e.g. like konqueror).

The solution is to use anonymous mmap() to temporarily allocate space to
hold the symbol cache.  In testing I found that replacing the alloca()
with mmap() has no observable degredation in performance.

It should be noted that this bug does not necessarily cause an immediate
crash but can instead result in long term corruption and instability in
applications that load modules from threads.  The bug is almost certainly
responsible for some of the instabilities found in konqueror, for example,
and possibly netscape too.

Sleuthing work by: Martin Blapp <mb@imp.ch>
X-MFC after:	Before or after the 4.6 release depending on the release engineers
2002-06-10 18:52:31 +00:00
Doug Rabson
b5393d9f78 Add ia64 support. Various adjustments were made to existing targets to
cope with a few interface changes required by the ia64. In particular,
function pointers on ia64 need special treatment in rtld.
2001-10-15 18:48:42 +00:00
John Polstra
c15e7faad5 Performance improvements for the ELF dynamic linker. These
particularly help programs which load many shared libraries with
a lot of relocations.  Large C++ programs such as are found in KDE
are a prime example.

While relocating a shared object, maintain a vector of symbols
which have already been looked up, directly indexed by symbol
number.  Typically, symbols which are referenced by a relocation
entry are referenced by many of them.  This is the same optimization
I made to the a.out dynamic linker in 1995 (rtld.c revision 1.30).

Also, compare the first character of a sought-after symbol with its
symbol table entry before calling strcmp().

On a PII/400 these changes reduce the start-up time of a typical
KDE program from 833 msec (elapsed) to 370 msec.

MFC after:	5 days
2001-05-05 23:21:05 +00:00
John Polstra
7dbe16fbee When a threads package registers locking methods with dllockinit(),
figure out which shared object(s) contain the the locking methods
and fully bind those objects as if they had been loaded with
LD_BIND_NOW=1.  The goal is to keep the locking methods from
requiring any lazy binding.  Otherwise infinite recursion occurs
in _rtld_bind.

This fixes the infinite recursion problem in the linuxthreads port.
2000-01-29 01:27:04 +00:00
John Polstra
d3980376e8 Add a new function dllockinit() for registering thread locking
functions to be used by the dynamic linker.  This can be called by
threads packages at start-up time.  I will add the call to libc_r
soon.

Also add a default locking method that is used up until dllockinit()
is called.  The default method works by blocking SIGVTALRM, SIGPROF,
and SIGALRM in critical sections.  It is based on the observation
that most user-space threads packages implement thread preemption
with one of these signals (usually SIGVTALRM).

The dynamic linker has never been reentrant, but it became less
reentrant in revision 1.34 of "src/libexec/rtld-elf/rtld.c".
Starting with that revision, multiple threads each doing lazy
binding could interfere with each other.  The usual symptom was
that a symbol was falsely reported as undefined at start-up time.
It was rare but not unseen.  This commit fixes it.
1999-12-27 04:44:04 +00:00
Peter Wemm
7f3dea244c $Id$ -> $FreeBSD$ 1999-08-28 00:22:10 +00:00
John Polstra
962fdc466a Fix a serious performance bug for large programs on the Alpha,
discovered by Hidetoshi Shimokawa.  Large programs need multiple
GOTs.  The lazy binding stub in the PLT can be reached from any of
these GOTs, but the dynamic linker only has enough information to
fix up the first GOT entry.  Thus calls through the other GOTs went
through the time-consuming lazy binding process on every call.

This fix rewrites the PLT entries themselves to bypass the lazy
binding.

Tested by Hidetoshi Shimokawa and Steve Price.

Reviewed by:	Doug Rabson <dfr@freebsd.org>
1999-06-25 02:53:59 +00:00
John Polstra
d5b537d01a Eliminate all machine-dependent code from the main source body and
the Makefile, and move it down into the architecture-specific
subdirectories.

Eliminate an asm() statement for the i386.

Make the dynamic linker work if it is built as an executable instead
of as a shared library.  See i386/Makefile.inc to find out how to
do it.  Note, this change is not enabled and it might never be
enabled.  But it might be useful in the future.  Building the
dynamic linker as an executable should make it start up faster,
because it won't have any relocations.  But in practice I suspect
the difference is negligible.
1999-04-09 00:28:43 +00:00
Doug Rabson
13575fc46f Add alpha support.
Submitted by: John Birrell <jb@cimlogic.com.au> (with extra hacks by me)
Obtained from: Probably NetBSD
1998-09-04 19:03:57 +00:00